Managing Computer Network Resources

ABSTRACT

Software agents are assigned goals in accordance with network policies that define a desired operational characteristic of a computer network. A software agent achieves its assigned goal by executing a predefined task. An assigned goal may be dynamically modified as necessary based on the actual operational characteristics of the network. The software agent may request further policy if it cannot achieve its assigned goal by performing the predefined task.

FIELD OF THE INVENTION

This invention relates to computer networks. In particular, it relates to the management of computer networks.

BACKGROUND

Computer networks need to be constantly managed in order to ensure smooth and efficient operation. Such management typically includes ensuring robustness (i.e. the ability of the network to continue operating even if nodes fail), quality of service (QoS), scalability (i.e. the network must operate regardless of the number of nodes), etc.

Typically, network management is performed by humans or is, to a large extent, dependent on human input. This is undesirable, particularly, for example, in the case of a network having a large number of nodes because of the time it would take a human to identify and fix a failed node.

It is therefore desirable that networks run themselves as much as possible.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a schematic drawing of a system for managing a network in accordance with the invention;

FIG. 2 shows a schematic drawing of the components of an agent runtime environment;

FIG. 3 shows a schematic drawing of two agent runtime environments installed on a host device;

FIG. 4 shows a flowchart of the startup method for an agent runtime environment and the operation of the agent runtime environment;

FIG. 5 shows a flowchart of an agent start-up process;

FIG. 6 shows a flowchart of the process of shutting down an agent runtime environment;

FIG. 7 shows a schematic drawing of the generic structure of an agent;

FIG. 8 shows a flow chart of the lifecycle of a simple discovery agent;

FIG. 9 shows a flow chart of the lifecycle of a simple policy agent;

FIG. 10 shows a flowchart of the lifecycle of the more complex policy agent;

FIG. 11 shows a flow chart of the lifecycle of a monitoring agent;

FIG. 12A shows a flow chart of a feedback loop setup by a monitoring agent;

FIG. 12B shows a flow chart of another embodiment of a feedback loop setup by a monitoring agent;

FIG. 13 shows a client browser plug in activation/request sequence according to one embodiment of the invention;

FIG. 14 shows a block diagram of an ARE comprising a local modeler and three policy modelers;

FIG. 15 shows the hierarchy of the various modelers within the system;

FIG. 16 shows the configuration of the modelers of FIG. 15 in greater detail;

FIG. 17 shows a flow chart of the decision making process used by the various modelers;

FIG. 18 illustrates the operation of a policy refinery in accordance with one embodiment of the invention;

FIG. 19 shows an implementation of a system in accordance with one embodiment of the invention;

FIG. 20 shows an implementation of a system in accordance with another embodiment of the invention; and

FIG. 21 shows a diagrammatic representation of exemplary hardware for performing aspects of the present invention.

DETAILED DESCRIPTION

The invention pertains to the management of computer networks. According to one embodiment of the invention, a method of managing a computer network uses software agents. The software agents operate within an agent runtime environment (hereinafter referred to as an “ARE”) which is hosted on a particular network (host) device. Each ARE allows agents operating therein to communicate with the host device, with other agents, and with an agent control mechanism. Each agent has at least one assigned goal which is expressed in the form of policy and which can be dynamically modified based on desired operational characteristics of the network.

System Overview

FIG. 1 shows a schematic drawing of a system 10 for managing a network in accordance with the invention. The system 10 is vertically layered, comprising an ARE 12 which is loaded on a host platform 14 defined on a network device. Operating within the ARE 12 are a number of agents. As will be described in greater detail below, agents may perform discovery, monitoring, and policy enforcement functions. The uppermost layer of the system 10 is occupied by an agent control mechanism 16 which comprises a CCE⁺ layer 18 sitting on top of a CCE⁻ layer 20 (CCE denotes a coherent computing entity). The CCE⁺ layer 18 defines a global system actuator comprising a Global Agent Repository and Dispatch Module 18.1, a Global Domain Policy Store Module 18.2, a Policy Authoring Tools and Policy Test Module 18.3 and a Modeler Module 18.4. The CCE⁻ 20 defines a distributed/localized system observer comprising a Local Agent Repository and Dispatch Module 20.1, a Local Domain Policy Store Module 20.2, a Local Domain Policy Request Module 20.3 and a Local Agent State Manager Module 20.4.

Each layer of the system 10 masks the complexity of the layers below it and may be configured to provide guaranteed services to the layers above it. Thus, at one level, the system 10 may be viewed as a macro quality of service (QoS) infrastructure with the host platform 14 defining a micro QoS infrastructure.

Typically, the system 10 is implemented on a large network comprising a number of network devices, each hosting an ARE. The AREs are able to create/instantiate the agents, suspend their operation, and terminate them. Each agent has one or more assigned goals, each of which is expressed in the form of policy. The module 18 provides a user interface, preferably in the form of a Graphical User Interface (GUI), which allows policy to be input and stored in module 18.2. Policy may be seen as the desired operational characteristics of a network on which system 10 is installed. Alternatively, policy may take the form of a Service Level Agreement (SLA), which is a documented result of a negotiation between a customer/client and a service provider that specifies the levels of availability, serviceability, performance, operation or other attributes of the service being provided using the network. Once policy is input into the CCE⁺ 18 it is disseminated through the various layers until it reaches the various agents seized with the task of implementing policy.

The system 10 is dynamic in that the agents monitor various operational characteristics of the network devices and report these characteristics to the CCE⁺ 18. This information is used by the global modeler module 18.4 to model network behavior. The module 18.3 allows test policy to be written, and the effects of such test policy on the system 10 may be determined by using numerical modeling techniques in the module 18.4. In this way, policy may be tested before implementation in order to determine optimal policy. The policy being implemented by the various agents may then be replaced dynamically with the optimal policy. The module 18.1 maintains a registry of all agents in the system 10.

Because the network on which the system 10 is installed may be large, it is preferable to divide the network into a number of functional/management domains. For example, the various departments of a company may each form a functional domain, or discrete business units may represent functional domains. The CCE⁻ 20 may be viewed as a control mechanism for a particular functional domain. Accordingly, the Local Agent Repository and Dispatch Module 20.1 maintains a registry of all agents in a particular functional domain. The relevant policy is stored in module 20.2. Module 20.3 allows the CCE⁻ 20 to request domain-specific policy from the CCE⁺ 18 where appropriate. The module 20.4 stores the state of each agent within a functional domain. The state of an agent may be active or passive. Each device within a functional domain has a domain adapter 15 which facilitates communications between the CCE⁻ 20 and the AREs in a particular functional domain.

The system 10 further includes a secure messaging service/subsystem which will be described in detail below.

In terms of the mechanics of policy handling and distribution within the system 10, the CCE⁺ 18 may be regarded as a global modeler, the CCE⁻ 20 as a regional modeler, and the combination of each domain adapter 15 and its associated ARE 12 as a local modeler as discussed below. As will be described in greater detail below, agents are used to monitor various operational parameters of the system 10 and to apply corrective policy to change these operational parameters should they deviate from established limits. For example, suppose that a particular network device has deviated outside its normal operational parameters. In this case an agent seized with the task of controlling this device will first determine whether it has the necessary corrective policy to change the operational parameters of the device. If the agent has the necessary corrective policy then it will apply it to control the device. However, if the agent lacks the corrective policy then it will request the corrective policy from the particular ARE within which it is operating. If this ARE does not have the corrective policy then an attempt will be made to obtain the corrective policy from other agents and AREs operating within the particular functional domain. Should this prove unsuccessful then a request will be made to the CCE⁻ 20 for the corrective policy. The CCE⁻ 20 will respond either by supplying the corrective policy to the agent or, if it does not have the corrective policy, by obtaining it from the CCE⁺ 18. Thus, it will be seen that system 10 implements a competence-based control flow mechanism wherein, in responding to a situation which requires corrective action, an agent will first attempt to take the appropriate corrective action, should it have the appropriate corrective policy. If the agent lacks the corrective policy it will request it from its host ARE, or other agents within its functional domain, or the CCE⁻ 20, or the CCE⁺ 18, as the case may be, and in the particular order as recited above.
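
The escalation order recited above may be pictured with a minimal Java sketch. It is illustrative only and forms no part of the described embodiments: the class names PolicyEscalation, PolicySource and Resolution, the use of string keys for conditions, and the in-memory maps are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: find corrective policy by asking each source in escalation order. */
final class PolicyEscalation {

    /** A named holder of corrective policies keyed by condition (illustrative only). */
    static final class PolicySource {
        final String name;
        final Map<String, String> policies = new HashMap<>();

        PolicySource(String name) {
            this.name = name;
        }

        PolicySource with(String condition, String policy) {
            policies.put(condition, policy);
            return this;
        }
    }

    /** The policy that was found and the source that supplied it. */
    static final class Resolution {
        final String policy;
        final String suppliedBy;

        Resolution(String policy, String suppliedBy) {
            this.policy = policy;
            this.suppliedBy = suppliedBy;
        }
    }

    /**
     * Walks the chain in the order given (agent, host ARE, domain peers, CCE-, CCE+)
     * and returns the first corrective policy found, or empty if no source has one.
     */
    static Optional<Resolution> resolve(String condition, List<PolicySource> chain) {
        for (PolicySource source : chain) {
            String policy = source.policies.get(condition);
            if (policy != null) {
                return Optional.of(new Resolution(policy, source.name));
            }
        }
        return Optional.empty();
    }
}
```

In this sketch the caller fixes the order of the chain, so the same lookup serves an agent that holds the policy itself and one that must go all the way to the CCE⁺ 18.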

Advantageously, the present invention may be employed in a wide variety of network applications. For example, the present invention may be used to implement self-stabilizing networks, or to perform load balancing, etc.

The various components of the system 10 are now described in greater detail.

The ARE

Each ARE 12 provides an open architectural framework or shell into which existing policies may be plugged by a network administrator, depending on the functional domain under consideration. In various embodiments an ARE 12 is implemented as a Java Virtual Machine (JVM) or an Object Request Broker (ORB), or as a vendor-specific interface which lacks the capacity to host an ARE. Equally, an ARE 12 may be implemented using remote procedure calls in a Common Object Request Broker Architecture (CORBA) framework. In some cases a host device may be unable to support an ARE or an agent running on it. In these cases, the nearest ARE, defined as the ARE with which the shortest communication path may be established, is used to provide a proxy for that host device. A particular advantage of the present invention is that it is able to leverage a JVM and/or ORB to provide an environment in which the agents can operate independently of the underlying hardware and operating system in which the ARE is located. Each ARE 12 provides a uniform context and set of services that agents can rely on, thereby acting as an operating system for the agents.

FIG. 2 shows a schematic representation of the components of an ARE 12 hosted on a host device 30. Running within ARE 12 are a number of agents 13. As can be seen, the host device 30 includes host hardware 32 which is the physical platform on which the ARE 12 is hosted. The host device 30 may be a computer, a switch, a router, a server, etc. According to one embodiment of the invention, the ARE 12 is written in C but is wrapped in Java in order to allow it to run on a Java Virtual Machine (JVM) 14. This allows the ARE 12 to run on a variety of hardware and operating systems without having to be customized for each physical combination of platform and operating system.

Reference numeral 17 indicates system resources which are available to the ARE 12 and agents 13. Typically, the system resources 17 may include firewall systems, network interface cards, routing planes, network processors, and the like. The ARE 12 further includes a set of service libraries 12.1 which provides the agents with access to common services such as a messaging subsystem, class loader, etc. These services will be described below. The ARE 12 also includes a device adaptor 12.2 which provides a generalized interface to a variety of device types. While the service libraries 12.1 provide interfaces to common services, such as a communication layer, which are available on each host device, the device adaptor 12.2 provides an interface to host-specific resources, such as a network forwarding plane. Each host device 30 has its own version of a device adaptor 12.2 which maps a generalized device interface to the specific commands needed for that particular device.

The ARE 12 further includes a security manager 12.3 which controls agent access to the service libraries 12.1 and the device adaptor 12.2. The security manager 12.3 ensures that resources are only used by authorized agents. Additionally, the security manager limits the rate at which particular agents may use a resource.
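
By way of illustration only, the two duties of the security manager 12.3 described above (admitting only authorized agents and limiting the rate of resource use) might be sketched in Java as follows. The class name ResourceGate, the fixed-window counting scheme, and the explicit time parameter are assumptions chosen for the example; the specification does not state how the rate limit is enforced.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: authorization check plus a fixed-window rate limit per agent and resource. */
final class ResourceGate {
    private final Set<String> grants = new HashSet<>();            // "agent|resource" pairs that are authorized
    private final Map<String, long[]> windows = new HashMap<>();   // key -> {windowStartMillis, usesInWindow}
    private final int maxPerWindow;
    private final long windowMillis;

    ResourceGate(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Records that the agent is permitted to use the resource. */
    void authorize(String agentId, String resource) {
        grants.add(key(agentId, resource));
    }

    /** Returns true only if the agent is authorized and has not exhausted its quota for the current window. */
    boolean tryAcquire(String agentId, String resource, long nowMillis) {
        String k = key(agentId, resource);
        if (!grants.contains(k)) {
            return false; // unauthorized agents never reach the resource
        }
        long[] window = windows.get(k);
        if (window == null || nowMillis - window[0] >= windowMillis) {
            window = new long[] {nowMillis, 0}; // start a fresh counting window
            windows.put(k, window);
        }
        if (window[1] >= maxPerWindow) {
            return false; // rate limit reached for this window
        }
        window[1]++;
        return true;
    }

    private static String key(String agentId, String resource) {
        return agentId + "|" + resource;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a deployed component would more likely read a clock itself.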

It is possible that more than one ARE 12 may run on a single host device, as is shown in FIG. 3 in which AREs 12A and 12B are shown installed on a JVM 14 which is running on a host device. The ARE 12A has a unique ID which is used to identify and reference it to policies that are to be enforced on the network. Each ARE 12A, 12B has a unique communications port ID assigned to it. Port assignments for AREs are configurable by command line arguments or system boot configuration files. The command line arguments or system boot files may be in the same format.

Policy changes are monitored by the agents 13, which listen on port ranges which are also dynamically controlled. The security manager 12.3 is responsible for encryption/decryption of secured connections within the system 10. Each ARE 12 uses a cryptographic subsystem which may be selected by a system administrator using a configuration manager or a configuration file loaded locally on an ARE 12.

ARE-to-agent, agent-to-agent, and ARE-to-ARE communication is performed via a messaging service. The messaging service assumes that both the agents and the AREs are able to decipher the messages that are addressed to them. Details of the messaging service are provided later.

As part of the service libraries 12.1, each ARE 12 supplies to each of its agents a set of utility classes which are usable by the agents running on the specific ARE 12. However, before using the utility classes, an agent is required to have permission from the security manager 12.3.

Each ARE 12 facilitates initialization of the agents 13 by running an auto configuration discovery process. This process requires each ARE to broadcast (upon its startup) a registration request to a topology service (which will be described in detail below) which responds to the request with the necessary information for the ARE to communicate with a network configuration server, policy server, etc. (as will be described in greater detail below).

Each ARE 12 is responsible for locating a route to various topology servers running the topology service via a multicast boot configuration service. This service provides information on how to find critical services required to operate an ARE.

FIG. 4 shows a flow chart of a startup method for an ARE 12. The startup method is executed by the device hosting the ARE 12. Referring to FIG. 4, at block 40 an ARE 12 configures a logging process used to log system messages, errors, etc. At block 42, the ARE checks for a Startup Agent List which is a list of agents, such as may be provided on the command line to the ARE at startup. If such a list exists, agents on the list are started one at a time by executing a start-agent process which includes a loop comprising starting an agent at block 46 and checking for another agent in the list at block 44 until all agents in the list have been started. If no Startup Agent List exists, the ARE assumes it is to run in a stand-alone mode, i.e. the ARE assumes that network resources are unavailable and sets itself up to run autonomously. At block 48 the ARE broadcasts a multi-cast registration message to any topology server to register itself and obtain messaging information necessary to communicate with the configuration manager. This broadcast includes the ID of the particular ARE, the network address of the host machine on which the ARE is running, and the port number used for communications by the ARE. At block 50 the ARE checks to see if it has received a Startup Reply from a topology server. If no Startup Reply is received, at block 58 a check is performed to see if a predetermined time within which the Startup Reply is expected has expired. If a Startup Reply is received, at block 52 a determination is made as to whether the reply includes a Startup Agent List. If it does, a loop is entered comprising starting an agent at block 56 and rechecking the list at block 54 until all agents in the list have been started. If at block 58 the predetermined time has not expired, at block 60 the ARE waits or sleeps for the unexpired portion of the predetermined time. If the predetermined time has expired, at block 62 a check is performed to see if a predetermined maximum number of broadcast retries have been exceeded. If not, block 48 is re-executed. If the predetermined maximum number of broadcast retries has been exceeded, at block 64 the ARE listens for messages. At block 66, the ARE determines whether a received message is a Shutdown message. If the received message is a Shutdown message, at block 68 the ARE shuts itself down; otherwise, at block 70 the ARE processes the message and repeats the process starting at block 64.
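
The registration portion of FIG. 4 (blocks 48 to 62) can be sketched as a bounded retry loop. The Java below is a simplified illustration and not the claimed method: the TopologyClient interface, the Outcome type and the NO_REPLY label are hypothetical, and the wait at block 60 is folded into the timeout passed to the client.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the broadcast-and-retry portion of ARE startup (FIG. 4, blocks 48-62). */
final class AreStartup {

    /** Assumed abstraction: broadcasts a registration and waits up to timeoutMillis for a Startup Reply. */
    interface TopologyClient {
        Optional<List<String>> register(String areId, String hostAddress, int port, long timeoutMillis);
    }

    enum Mode { REGISTERED, NO_REPLY }

    /** What the ARE learned from registration before it moves on to listening for messages. */
    static final class Outcome {
        final Mode mode;
        final List<String> agentsToStart;
        final int attempts;

        Outcome(Mode mode, List<String> agentsToStart, int attempts) {
            this.mode = mode;
            this.agentsToStart = agentsToStart;
            this.attempts = attempts;
        }
    }

    /** Broadcasts once, then re-broadcasts up to maxRetries times until a Startup Reply arrives. */
    static Outcome register(TopologyClient client, String areId, String hostAddress,
                            int port, long timeoutMillis, int maxRetries) {
        int attempts = 0;
        while (attempts <= maxRetries) {
            attempts++;
            // Block 48: the broadcast carries the ARE ID, host address and port.
            Optional<List<String>> reply = client.register(areId, hostAddress, port, timeoutMillis);
            if (reply.isPresent()) {
                // Blocks 50-52: a reply arrived; it may carry a Startup Agent List (possibly empty).
                return new Outcome(Mode.REGISTERED, reply.get(), attempts);
            }
            // Blocks 58-62: the reply window expired; loop again unless retries are exhausted.
        }
        return new Outcome(Mode.NO_REPLY, Collections.<String>emptyList(), attempts);
    }
}
```

In either outcome the ARE would then proceed to the message loop at block 64; the sketch stops short of that loop.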

The processing represented by block 70 includes:

(a) maintaining a list of all internal resources and services available to it and the agents operating within. This includes all agents that are currently running, proxied, or suspended;

(b) keeping a systems history log for each agent started;

(c) routing messages to and from agents it hosts. Each ARE 12 understands when an agent is referring to an external agent residing on another host as opposed to a local agent already running in the ARE 12. Accordingly, each ARE knows how to route messages between the various agents;

(d) performing start-up services required for proper agent operation prior to loading/instantiating/running any policy agent (as will be described below);

(e) starting the necessary agents and services upon a request from an authorized local or external agent requiring it to instantiate a particular agent not already instantiated. The particular ARE in which the requesting agent is running performs a check via its security manager to see if the requesting agent has permission to access this type of agent/service on either the current ARE or on a remote ARE;

(f) keeping a master thread group that starts, stops and resets agents within its control;

(g) requesting other external thread groups on other AREs to start, stop and reset proxied agents only if they have the appropriate authorization. A master thread group within each ARE 12 has the ability to control the priority of each agent in a thread group. This priority is set by the master thread group depending on the type of agent being run and the policies being enforced by that agent. The master thread group starts up everything within an ARE. This means that all agents and services started within the ARE are bootstrapped using the master thread group. The master thread group will pass into each thread group standard environmental settings to run within the thread group. Thread utilization is administered by a subsystem called a proxy thread manager (not shown) that acts as a thread controller and mediator for interaction between agents and threads. Each agent can be pacified by the proxy thread manager, and its thread made available to active agents in the event of such pacification. In other instances agents can simply return a quiet thread to the proxy thread manager pool for use by other agents; and

(h) suspending an agent after the agent has registered itself with the ARE, in order to conserve thread and resource usage since an agent does not need to be active until required, and reinstating the agent when the agent's services are required.

FIG. 5 shows a flowchart of an agent startup process executed by an ARE. In one embodiment the agents are defined by agent code written in Java, wherein the Class construct in Java is used to define an Agent Class. Referring to FIG. 5, at block 80, the ARE 12 loads the agent code, and any code the agent code is dependent on, into memory. During loading and at block 82, the security manager 12.3 verifies that access to the agent code is allowed. If the ARE does not have permission to access the agent code, or other code on which the agent code is dependent, then a security error is generated at block 84 and the ARE 12 logs an agent load failure. If no security error is generated, at block 86 an agent initialization process is called prior to calling an execution process. The agent initialization process allows any initialization tasks to be completed before execution of the agent code begins. Error messages generated during agent initialization are detected at block 88 and at block 84 an agent initialization failure is logged. If no error messages are generated during agent initialization, at block 90 the ARE 12 allocates a new thread of execution for the agent and executes a thread-start process, which executes the agent code. Each agent executes in its own thread, which is separate from the main thread of the ARE and from the threads of any other agents. At block 92 the agent is added to a registration table in the ARE 12.
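
A compact Java sketch of the sequence of FIG. 5 is given below for illustration. It assumes a hypothetical Agent interface with init and run methods, represents the security manager check by a simple predicate, and omits the class loading of block 80; none of these names come from the specification.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical sketch of agent startup: security check, initialization, thread start, registration. */
final class AgentLauncher {

    /** Assumed shape of an agent: an initialization step followed by the agent's main body. */
    interface Agent {
        void init() throws Exception;
        void run();
    }

    enum Result { STARTED, LOAD_FAILURE, INIT_FAILURE }

    private final Predicate<String> mayLoad;                                    // stands in for the security manager
    final Map<String, Thread> registrationTable = new ConcurrentHashMap<>();    // agent ID -> its thread

    AgentLauncher(Predicate<String> mayLoad) {
        this.mayLoad = mayLoad;
    }

    Result start(String agentId, Agent agent) {
        if (!mayLoad.test(agentId)) {
            return Result.LOAD_FAILURE;              // blocks 82 and 84: access denied, load failure
        }
        try {
            agent.init();                            // block 86: initialize before execution
        } catch (Exception e) {
            return Result.INIT_FAILURE;              // blocks 88 and 84: initialization failure
        }
        Thread thread = new Thread(agent::run, "agent-" + agentId);
        thread.start();                              // block 90: each agent runs in its own thread
        registrationTable.put(agentId, thread);      // block 92: record the running agent
        return Result.STARTED;
    }
}
```

A failed agent never reaches the registration table, so the table lists only agents whose threads were actually started.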

The process of shutting down an ARE is shown in FIG. 6. Referring to FIG. 6, at block 100 a check is performed to see if there are any agents left in the registration table. If no agents are left the agent shutdown process is complete and the ARE terminates. Otherwise, at block 102 an agent stop process is called which stops execution of an agent. At block 104 details of the agent whose execution was stopped are removed from the registration table. Returning to block 100, the shutdown process is repeated until there are no agents left in the registration table.
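
The shutdown loop of FIG. 6 might be sketched as follows. This is an assumption-laden illustration: the Stoppable interface and the map-based registration table are invented for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the ARE shutdown loop (FIG. 6, blocks 100-104). */
final class AreShutdown {

    /** Assumed handle through which an agent's execution can be stopped. */
    interface Stoppable {
        void stop();
    }

    /** Stops and deregisters agents one at a time; returns the IDs in the order they were stopped. */
    static List<String> shutDown(Map<String, Stoppable> registrationTable) {
        List<String> stopped = new ArrayList<>();
        while (!registrationTable.isEmpty()) {                       // block 100: any agents left?
            Iterator<Map.Entry<String, Stoppable>> it = registrationTable.entrySet().iterator();
            Map.Entry<String, Stoppable> entry = it.next();
            String agentId = entry.getKey();
            entry.getValue().stop();                                 // block 102: stop the agent
            it.remove();                                             // block 104: remove it from the table
            stopped.add(agentId);
        }
        return stopped;                                              // table is empty: the ARE may terminate
    }
}
```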

Agents

As used herein the term “agent” denotes a program that performs some type of operation, which may be information gathering or some processing task, in the background. A foreground process is one that accepts input from a keyboard, mouse, or other input device whereas a background process cannot accept interactive input from a user, but can access data stored on a disk and write data to a display. In particular, a software agent is a virtual entity which: (a) is capable of acting in a runtime environment; (b) can communicate directly with other agents (messaging); (c) is driven by a set of tendencies (expressed in the form of individual objectives or of a satisfaction/survival function which it tries to optimize or policy); (d) possesses resources of its own (logic and algorithms); (e) is capable of perceiving its environment (state); (f) has only a partial representation of its environment; and (g) is able to reproduce/clone itself.

An agent tends to satisfy its policy directives taking account of the resources and logic available to it, and depending on its perception of its state, its representations, and the communications it receives.

According to embodiments of the invention, agents may be simple (or static and narrow in scope). Alternatively an agent may be complex (mobile, intelligent and autonomous). A simple or static agent is one which is always located in a particular ARE. In other words, it is static within the environment of the ARE. On the other hand, a complex or mobile agent is one which is trusted to migrate to other AREs hosted by other network devices.

According to embodiments of the invention, agents may be autonomous and/or intelligent. Both intelligent and autonomous agents have a degree of sophistication which allows them to algorithmically interpret the state of their environment and to perform tasks independently without human interaction. They possess representations of their environment and inference mechanisms which allow them to function independently of other agents. The difference between intelligent and autonomous agents lies in the scope of the representation of the environment that they possess, an intelligent agent having a greater representation of its environment than an autonomous agent. Accordingly, an intelligent agent is able to solve complex problems without reference to outside entities whereas an autonomous agent is able to independently solve complex problems up to a certain level beyond which the autonomous agent will have to request further policy or data from an outside entity such as the agent control mechanism 16 of FIG. 1.

As discussed above, mobile agents can migrate under their own control from machine to machine in a heterogeneous network. However, mobile agents cannot act on their own on any platform without external activation messages. In other words, a mobile agent can suspend its execution at any arbitrary point, move to another machine and resume execution there only after receiving a dispatch message from the agent control mechanism 16. The primary reason for this is that a mobile agent is not provisioned to act upon its new environment and must thus wait for the necessary algorithms and logic to be received before migrating.

As will be appreciated, mobility is a powerful attribute for an agent as it allows an agent to perform tasks such as information gathering in a network which includes a distributed collection of information resources. Further, by being able to migrate to a particular network location, a mobile agent eliminates all intermediate data transfer and can access a resource efficiently even if the resource provides only low-level primitives for working within it. This is beneficial, particularly in a low bandwidth network where instead of moving the data to a computational resource it is more practical to move the computational resource to the data.

Mobile intelligent agents are able to migrate under their own control from machine to machine in a heterogeneous network without waiting for external activation messages.

A further attribute of intelligent and autonomous agents is that they have the capacity to anticipate future events and to prepare for them. These agents are able to use a capacity for algorithmically induced reasoning based on representations of the environment to memorize situational parameters (data points), analyze them, and request additional algorithmic components (policy from the agent control mechanism). In the event that there is a conflict in goals between these agents, these agents are able to negotiate among themselves to determine which goals/policy are more relevant to satisfy needs.

Agent Structure

FIG. 7 shows the broad generic structure of an agent 110 according to one embodiment of the invention. The agent 110 includes a messaging adapter 112 which provides the functionality of sending messages to and receiving messages from other agents, AREs and components of the agent control mechanism 16. Sending messages is a specific type of action for which there is policy. On the other hand, receiving messages is implemented as a sensing process which models a receive-queue.

The messaging adapter 112 may be implemented, for example, using the Transport Layer Security (TLS) protocol. One advantage of TLS is that it is application protocol independent. Higher level protocols can layer on top of the TLS Protocol transparently.

The agent 110 further includes a policy repository 114 which contains agent specific policy that is necessary for the operation of the agent 110. This policy is obtained from the policy stores of the CCE⁺ 18 and the CCE⁻ 20.

The policy repository 114 of agent 110 may be implemented using a well-known Lightweight Directory Access Protocol (LDAP) directory structure. LDAP uses LDAP directories to implement Public Key Infrastructure (PKI) security. From a user's point of view, LDAP provides the directory in which the certificates of other users are found, enabling secure communication. From an administrator's point of view, LDAP directories are the way in which certificates can be centrally deployed and managed.

Alternatively, a Policy Knowledge Base (PKB) model may be used. PKB policy reporting offers three types of interface services: assertional services, retrieval services, and active information services. Assertional services allow agents to assert new beliefs (policy) into the knowledge base (i.e., to create instances of policy, and to change the values of attributes of existing policy instances). Retrieval services provide access to policy that is actually stored in the agent knowledge base. Active information services offer access to information in the knowledge base upon demand. Requesting an active information service starts a monitoring process that recognizes specific changes in the knowledge base and sends information about these changes to the requesting process, i.e., to specific control layers.
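
The three PKB interface services may be illustrated with a small in-memory Java sketch. It is offered under stated assumptions only: the class name PolicyKnowledgeBase, the representation of policy as name/value strings, and the callback-based notification are choices made for the example and are not drawn from the specification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiConsumer;

/** Hypothetical sketch of a PKB exposing assertional, retrieval and active information services. */
final class PolicyKnowledgeBase {
    private final Map<String, String> policies = new HashMap<>();
    private final Map<String, List<BiConsumer<String, String>>> watchers = new HashMap<>();

    /** Assertional service: creates a policy instance or changes its value, notifying watchers on change. */
    void assertPolicy(String name, String value) {
        String previous = policies.put(name, value);
        if (!value.equals(previous)) {
            List<BiConsumer<String, String>> interested =
                    watchers.getOrDefault(name, Collections.<BiConsumer<String, String>>emptyList());
            for (BiConsumer<String, String> watcher : interested) {
                watcher.accept(name, value);
            }
        }
    }

    /** Retrieval service: returns the policy currently stored under the name, if any. */
    Optional<String> retrieve(String name) {
        return Optional.ofNullable(policies.get(name));
    }

    /** Active information service: registers a callback that is told about later changes to the policy. */
    void watch(String name, BiConsumer<String, String> onChange) {
        watchers.computeIfAbsent(name, k -> new ArrayList<>()).add(onChange);
    }
}
```

Re-asserting an unchanged value is deliberately silent in this sketch, so that a watching control layer is only woken by an actual change.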

The agent 110 further includes a method repository 116. In one embodiment all agents are instances of an Agent Class described in Java. A number of methods are written for the Agent Class to add functionality to agents based on the policy the agent is required to implement. The method repository 116 is a store of all such methods for an agent 110.

The agent 110 further includes a control unit 118. The control unit 118 executes policy using relevant methods from the method repository 116 to calibrate, execute, activate, respond and deactivate specific actions based on policy.

Atomic actions are actions whose execution is started and then either succeeds or fails. The execution of continuous actions initiates a control process which will run until explicitly finished, suspended, or deactivated. Examples of these types of actions are actions such as activating a control algorithm to make a router shut down a port upon detecting an attack.

Included in the agent control unit 118 is a method requestor/dispatcher which is defined by a generic “Layer” object. The main method of the Layer Object describes a “sense-recognize-classify-invoke method” cycle. In each loop of this cycle the current policies of the agent 110 are scanned for new relevant situations relative to a current system state. These situations, together with the messages received from the next lower layer (which defines the domain adapters 15), are used to compute new options for the agent 110. These options are then passed to the CCE⁻ 20 (or CCE⁺ 18 if necessary), which is responsible for making decisions (i.e., for selecting appropriate operational primitives and for updating the policy structure) and for scheduling and executing those decisions which are expressed in the form of new policy. In particular, this mechanism decides whether a layer will deal with a specific goal problem by itself, or whether it will pass the goal up to the next higher control layer. In the former case, the layer decides what new policy to make in order to achieve the goal and schedules the new policy for execution. The execution of new policy is linked to the situation recognition process: when an action is executed, a reactor is activated which monitors the expected effects of the action.

The fundamental activity cycle comprises updating policy, situation recognition, goal activation, a generation of situation-goal pairs as input to a planning, scheduling, and execution process, and a competence check that divides the set of situations and goals into a subset for which the layer is competent and for which plans are generated and executed by the layer, and another subset for which the layer is not competent and for which control is shifted to the next higher layer.
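
The competence check at the end of the activity cycle can be pictured with the following Java sketch. It is a simplification under assumed names (ControlLayer, SituationGoal, CycleResult), and it reduces competence to membership of the goal in a known set, which the specification does not prescribe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: split situation-goal pairs into those handled locally and those passed upward. */
final class ControlLayer {

    /** A recognized situation paired with the goal it activates. */
    static final class SituationGoal {
        final String situation;
        final String goal;

        SituationGoal(String situation, String goal) {
            this.situation = situation;
            this.goal = goal;
        }
    }

    /** The two subsets produced by the competence check. */
    static final class CycleResult {
        final List<SituationGoal> handledHere = new ArrayList<>();
        final List<SituationGoal> passedUp = new ArrayList<>();
    }

    private final Set<String> competentGoals;   // goals this layer can plan for (assumed representation)

    ControlLayer(Set<String> competentGoals) {
        this.competentGoals = competentGoals;
    }

    /** Keeps pairs whose goal this layer is competent for; the rest go to the next higher layer. */
    CycleResult competenceCheck(List<SituationGoal> pairs) {
        CycleResult result = new CycleResult();
        for (SituationGoal pair : pairs) {
            if (competentGoals.contains(pair.goal)) {
                result.handledHere.add(pair);
            } else {
                result.passedUp.add(pair);
            }
        }
        return result;
    }
}
```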

As shown in FIG. 7, agent 110 communicates with a domain adapter 115, which may be represented as an object providing methods for sensor calibration, enabling and disabling sensor activity, and for reading current values relevant to the respective domain.

Current sensory values are made available to the agent control unit 118 by a perception buffer from which the values of sensors can be read, and which can be explicitly updated. The perception buffer itself calls the methods offered by the individual sensors, e.g., in order to read the current sensory information.

There is a flow of information between the policy repository 114 and the AREs, the CCE⁻ 20, and the CCE⁺ 18. New policy derived from an agent's perception is sent to the CCE⁻ 20 and entered into a PKB located in the Local Domain Policy Store 20.2. In the event the CCE⁻ 20 determines that an ARE operating within its functional domain needs to be provisioned with a new policy, it “pushes” the new policy to a PKB in the relevant ARE, which then “pushes” the new policy to specific agents if necessary.

Each agent control unit 118 continuously accesses information from the repository 114. The situation recognition processes at the different layers of system 10 evaluate environmental parameters in order to determine whether new policy is required. This involves a planning process which evaluates preconditions of actions.

According to other embodiments of the invention, an agent control unit 118 may modify the policy repository 114 by signaling first to its associated ARE, and subsequently to the CCE⁻ 20, a demand for new policies. The derivation of new policy, which is known as knowledge abstraction, may be anchored either in the PKB or in the control unit of the agent. The former alternative describes the knowledge base of an agent as an active, blackboard-like system, whereas the latter alternative corresponds to the view of a classic Artificial Intelligence (AI) planning system. The preferred alternative depends on the power of the inferential services provided by the knowledge base in repository 114.

General Characteristics of Agents

As will be appreciated, various implementations of the agents are possible; however, each of the agents will have the following general characteristics:

(a) agents assume a secured environment already exists wherever they run (there is only one Security Manager per ARE);

(b) each agent assumes certain services (such as a Logging Component, Security Manager, Device Adaptor, class loader, etc.) are available to it from the ARE;

(c) agents assume there is a (limited) statically defined relationship hierarchy between other agents at runtime which is defined in part by the messaging service, modelers, policies, and a topology service;

(d) agents assume that all system resources are freely available and if a resource is available to an agent, the agent has permission to use the resource unless otherwise restricted by the security manager 12.3;

(e) each agent has a unique ID within the ARE in which it operates and which, when coupled with the ARE's unique ID, defines a unique agent name; a Globally Unique ID (GUID) generator generates an agent's unique ID;

(f) each agent maintains information regarding its management status and the management domain under which it is being administered;

(g) an agent's management domain state may be defined in one of three ways, viz. Controlling, Subordinate, or Not Applicable (NA). With Controlling, the agent acts as a Policy Decision Point (PDP), i.e. a logical entity that makes policy decisions for itself or for other network elements that request such decisions. With Subordinate, the agent refers the policy decision point to another agent. With Not Applicable, the agent ignores any policy decision requests made to it by the system 10;

(h) each agent is assigned a named thread group upon creation;

(i) each agent is passed shared variables used to facilitate Inter-Process Communications (IPC) messaging transfer between a master message queue and each agent's message queue;

(j) each agent upon instantiation is assigned a thread priority level by the ARE, which will be determined by policy and agent class type being instantiated;

(k) agents register with the local ARE when instantiated; and

(l) agents have the ability to be passivated and/or re-activated by the ARE.
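Characteristics (e) and (g) above can be pictured with the following Java sketch. It is illustrative only: the class names, the use of java.util.UUID as a stand-in for the GUID generator, and the "/" separator in the agent name are assumptions and are not specified in the description.

```java
import java.util.UUID;

// Hypothetical types; the specification names the concepts but not an API.
enum ManagementDomainState { CONTROLLING, SUBORDINATE, NOT_APPLICABLE }

final class AgentIdentity {
    final String areId;
    final String agentId;
    final ManagementDomainState state;

    AgentIdentity(String areId, String agentId, ManagementDomainState state) {
        this.areId = areId;
        this.agentId = agentId;
        this.state = state;
    }

    // Assumption: java.util.UUID stands in for the GUID generator.
    static AgentIdentity create(String areId, ManagementDomainState state) {
        return new AgentIdentity(areId, UUID.randomUUID().toString(), state);
    }

    // The unique agent name couples the ARE's unique ID with the agent's ID.
    // The "/" separator is an assumption made for this sketch.
    String uniqueName() {
        return areId + "/" + agentId;
    }

    // Only an agent in the Controlling state acts as a policy decision point.
    boolean actsAsPolicyDecisionPoint() {
        return state == ManagementDomainState.CONTROLLING;
    }
}
```

Two agents created in the same ARE share the ARE prefix but receive different agent IDs, so their names differ.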

As discussed above, each ARE has the ability to provide feedback on the per-thread utilization of each thread running within it. This information is used by the CCE⁻ 20 to, for example, help determine load balancing on the AREs.

Agent Behavior

Agents are used to enforce specific actions/behavior on hardware/software devices controlled by the system 10. Agents have both a policy-enforcement-point (PEP) aspect, i.e. a logical entity that enforces policy decisions, and a policy-decision-point (PDP) aspect. In combination with the local modelers, each agent may be viewed as a PDP. Agents use an event notification system whereby policies are set up as alerts to be monitored within the local modelers, which notify the appropriate agents that a policy SLA is not within its desired operational range. Agents use a two-stage feedback mechanism, which links the local modelers to the agents to provide the alert notification feedback loop required to monitor device/application evaluation and administer corrective actions within the network for a given network state.

Agents depend on each ARE to route messages within an ARE to other agents either within the same ARE or on other AREs, which can be running on either the same or a different machine. Agents are able to request both local and remote network resources that are registered within an ARE. Agents rely on their associated local modelers to replicate fault tolerance behavior and state information to other local modelers running on different machines. This is used in conjunction with the topology service to replicate state across the functional domains of system 10.

When agents initialize, they register themselves with their associated ARE and local modeler. Each agent registers its policy/SLA requirements. The local modeler then sets up a listener to listen for a particular polling agent's ID, and the polling agent relays its feedback information to the local modeler. The agent's action commands are called by the local modeler when information from a monitoring agent indicates the agent's policy/SLA is no longer within its operational range.

Agents in conjunction with the associated local modelers set up feedback processing loops to monitor policy within the device being controlled.

In order for an agent to implement policy it may need to request services that are not currently running or loaded on the local ARE host. If the local ARE cannot support such an agent request, a request to the topology service is made to find a remote ARE that is either running or can run the required service. If the security check passes for the requested service, the service is either started on the remote ARE or the ARE attaches to the existing remote service. The remote ARE then proxies back the service to the local ARE requesting the service. If the requested service cannot be started anywhere within the system 10, the agent sends back an exception indicating the required service could not be started. This exception is sent along a system alert handling interface developed for the AREs.

When an agent loads a device adapter, it makes a security access call to verify that (a) the classes or libraries needed to communicate with and control the device to be managed haven't been altered and (b) the agent's device adapter has rights to the resources the classes and libraries are accessing.

The means of communication for ARE-to-Agent, Agent-to-Agent, and ARE-to-ARE exchanges is the messaging service. Each sending agent has to determine if a receiving agent is receiving and acknowledging the sending agent's request.

Once the agents have set up the initial policies and configurations, they are then passivated by the ARE (i.e. the agent's thread is suspended) so as to conserve resources within the ARE.

Having broadly described generic agents in the system 10, specific embodiments of agents will now be described.

Discovery Agents

The system 10 includes discovery agents which are used to examine and determine the capability of a network device (host) and the equipment or components of the device. Each discovery agent then reports this information to a topology server so that the system 10 can use the information to construct a topological representation of the network.

A simple or static discovery agent might, for example, simply start executing when a device is first powered up and thereafter look for any new device components or changes in capability to existing components, report the information to the topology server and thereafter terminate. This allows the system 10 to know about equipment changes, such as the adding of a new processor, more memory, or a larger disk to a device. Discovery agents may be written to check for new components or capabilities deemed important to the operation of the devices.

The life cycle of a simple discovery agent is shown in FIG. 8. Referring to FIG. 8, at 130 a configure logging process is performed. At 132 the discovery agent performs the discovery process. At 134 the agent reports the discovered configuration to a topology server and at 136 the discovery agent unregisters from the ARE.

Policy Agents

The system 10 also includes policy agents which are used to enforce or implement specific policy. Policies defined in the system 10 are parsed by a policy agent and the agent configures/controls a device in order to enforce the desired policy. Policy enforcement often involves configuring a device to behave in a specific way and thereafter simply letting the device operate. However, policies may be more complex and may require dynamic control of a device. Policy agents operate on local devices (devices running an ARE in which the policy agent is executing) or remote devices (devices which cannot run an ARE for agents to execute in).

FIG. 9 shows a lifecycle of a simple policy agent, which configures a device and thereafter lets the device run. Referring to FIG. 9, at 140 a configure logging process is performed. At 142 the policy agent requests or receives policy parameters from a policy server. Thereafter at 144 a determination is made as to whether the parameters have been received. At 146 if the parameters have been received then the device is configured to operate in accordance with the policy parameters. If no policy parameters are received or after step 146 has been performed, then at step 148 the policy agent unregisters from the ARE.

FIG. 10 shows a flowchart of a more complex policy agent which works in conjunction with monitoring agents (see below) and other policy agents. This type of policy agent continuously monitors for feedback from monitoring agents and adjusts the policy enforcement settings for a device based on such feedback. Referring to FIG. 10, at 150 a configure logging process is performed. At 152 the agent requests/receives policy parameters from a policy server. At 154 a determination is made as to whether the policy parameters have been received. If the parameters have been received then at 156 the agent configures the device to operate in accordance with the policy parameters. Thereafter, at 158 the agent listens for messages from a monitoring agent and at 160 a received message is parsed to check if it is an agent shutdown message. If it is a shutdown message then at 164 the agent is unregistered from the ARE. If the message is not a shutdown message then it may be a new policy or “monitor results” message which is parsed at 162, and step 156, comprising configuring the device in accordance with the new policy parameters, is performed.

Thus, the complex policy agent goes beyond configuring a device and letting it run and adds a level of dynamic control at the policy enforcement point. The policy agent code itself can become part of the operating policy, shifting between specified policy enforcement settings based on the observed dynamics of a network as reported by associated monitoring agents.
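The control flow of FIG. 10 may be sketched in Java as follows. This is a simplified illustration and not the described implementation: AgentMessage and ComplexPolicyAgentSketch are hypothetical names, the policy server and the messaging service are replaced by plain arguments, and configuring the device is reduced to recording the parameters applied. The sketch also assumes that the agent unregisters when no policy parameters are received, which the description states only for the simple policy agent of FIG. 9.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical message type for illustration only.
final class AgentMessage {
    enum Kind { SHUTDOWN, NEW_POLICY, MONITOR_RESULTS }

    final Kind kind;
    final String parameters;

    AgentMessage(Kind kind, String parameters) {
        this.kind = kind;
        this.parameters = parameters;
    }
}

final class ComplexPolicyAgentSketch {
    final List<String> appliedConfigurations = new ArrayList<>();
    boolean registered = true;

    // Steps 152 to 164 of FIG. 10. A real agent would block while
    // listening; this sketch simply stops when the inbox is empty.
    void run(String initialParameters, Queue<AgentMessage> inbox) {
        if (initialParameters != null) {                      // 154: parameters received?
            appliedConfigurations.add(initialParameters);     // 156: configure the device
            AgentMessage message;
            while ((message = inbox.poll()) != null) {        // 158: listen for messages
                if (message.kind == AgentMessage.Kind.SHUTDOWN) { // 160: shutdown message?
                    break;
                }
                // 162: parse the message, then 156 again with the new parameters
                appliedConfigurations.add(message.parameters);
            }
        }
        registered = false;                                   // 164: unregister from the ARE
    }
}
```

Messages queued after the shutdown message are left unread, which mirrors the flowchart's exit at 164.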

Monitoring Agents

The system 10 further includes monitoring agents to monitor various aspects of the devices in the network. Monitoring agents are able to monitor local devices (i.e. devices running an ARE in which the monitoring agent is executing) or remote devices (i.e. devices which are not able to run an ARE). Each monitoring agent has monitoring policy which determines the operation or set of operations it should perform to monitor the device, and a set of thresholds, within which the monitoring results should fall. When monitoring results are found to lie outside the specified threshold range, the monitoring agents report the event to the system 10. Complex monitoring agents may be written to combine monitored values in a variety of ways using threshold ranges, time periods, rates of change, etc. A hierarchy of monitoring agents can also be used in order to monitor inter-dependent devices, or a sequence/path of devices.

A typical monitoring agent has a lifecycle with a flow of execution similar to that shown in FIG. 11. Referring to FIG. 11, at 170 a configure logging process is performed. At 172, the monitoring agent requests/receives monitoring parameters from a configuration manager. At 174 a determination is made as to whether or not the monitoring parameters have been received. If the parameters have been received then at 176 the monitoring agent performs the monitoring operation. Thereafter, at 178 a determination is made as to whether the monitoring results lie outside a specified threshold. If the results lie outside the threshold then at 180 this is reported to a policy agent and/or CCE⁻ 20 (see below). If the monitored results fall within the specified threshold then at 182 the monitoring agent sleeps or lies dormant for a specified time (based on a monitoring frequency) after which, at 184, a check is performed to see if new monitoring parameters have been received in the form of a new message. Depending on the results of the check, step 176 may be performed again. At 186 a determination is made as to whether the new message is a message to shut down the monitoring agent. If the message is a shutdown message then at 188 the monitoring agent unregisters from the ARE.
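The threshold decision at step 178 of FIG. 11 can be sketched in Java as shown below. ThresholdRange and MonitoringStep are hypothetical names, and treating the range bounds as inclusive is an assumption; the description does not say how boundary values are classified.

```java
// Hypothetical helper types; bounds are treated as inclusive by assumption.
final class ThresholdRange {
    final double low;
    final double high;

    ThresholdRange(double low, double high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    boolean contains(double value) {
        return value >= low && value <= high;
    }
}

final class MonitoringStep {
    enum Outcome { REPORT, SLEEP }

    // Step 178: results outside the range are reported (step 180);
    // results inside the range let the agent sleep (step 182).
    static Outcome evaluate(double measured, ThresholdRange range) {
        return range.contains(measured) ? Outcome.SLEEP : Outcome.REPORT;
    }
}
```

For example, with a range of 0 to 80 (say, percent link utilization, an illustrative unit), a reading of 95 would be reported while a reading of 40 would send the agent back to sleep.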

Detailed Description of Device Adapter Interface 12.2

Each device adapter 12.2 defines a boundary that separates generic system calls from device specific functional calls made upon a particular device(s)/application(s). Both the policy agents and the monitoring agents may make calls to a device adapter 12.2 provided that the functionality of the command and control and monitoring feature sets do not overlap. Each device adapter 12.2 has two types of API calls for the command and control features. Both generic Traffic Engineering (TE) and Application Engineering (AE) calls are used to control devices and applications within the system 10. Each device adapter 12.2 includes device specific drivers which are loadable and unloadable at runtime.

In use, the device adapters “Ping” a device and maintain an open communications channel with the device to make sure a connection with the device does not time out. Monitoring agents are used to achieve this.

Appropriate classes or components needed in order to implement a policy-configuration described within a policy are loaded into each device adapter 12.2 at runtime. Each device adapter 12.2 has a command that signals to the device the device adapter is ready to initiate a device command sequence and a command used to signal to the device the device adapter has finished a device command sequence.

At any point in a communication between a device adapter 12.2 and a device to be managed, an agent may restart and reload class or library drivers to passivate, then reactivate, the connection between the device adapter and the device. In response to an agent request to reactivate the connection, the device adapter 12.2 internally calls a process which passivates communication between the device adapter and the device and a process which resets the device adapter.

An agent may query the device that is being managed by a device adapter 12.2 to obtain the current device state depending on what the lag time is between requesting the information from the device and actually receiving it. The agent requesting the information will get different degrees of information back depending on the device information calling level requested. The calls made between each device adapter and the device being managed are preferably secure calls.

Each device adapter 12.2 includes a device adapter toolkit comprising a Command and Control API interface and reference implementations of a Hyper Text Transfer Protocol (HTTP), Simple Network Management Protocol (SNMP), Secure Shell (SSH), and a plain-text device adapter. The device adapter toolkit provides a general interface for management functions used to control host devices.

According to one embodiment of the invention, monitoring agent API calls are designed to set up feedback loops between a monitoring agent and a device or application which facilitates monitoring compliance with policy within the system 10.

The monitoring agent API assumes there will be some lowest common denominator of communication used for each type of device monitored, be it hardware or software. An example would be to use the SNMP protocol for hardware/software devices. Alternatively, a software solution in which an SNMP library is compiled into the application may be used.

There are four types of monitoring within the system 10. These are active (push), active (pull), passive (pull), and passive proxy monitoring.

With active (push) monitoring a device or application would actively broadcast or send information regarding its state to other system 10 components. In one embodiment of the invention, a call is made to a device to start up an application loaded therein which would begin the broadcast which is received and understood by a monitoring agent. Thus, there is an active agent running on the device sending information out in a controlled and timely manner.

With active (pull) monitoring, the agent itself is not passivated and continues to make request/response types of communication between the device adapter and the device/application to be monitored.

With passive (pull) monitoring, the monitoring agent listens for SNMP traps from a particular device(s). The monitoring agent does not actively set up a feedback loop.

With passive-proxy monitoring, a monitoring agent acts through non-direct means to garner information about the device and its statistics by interrogating neighboring devices.

Each monitoring agent connection has two or more parties involved in each feedback loop between a device 192 and application 194, as can be seen from FIG. 12A. The monitoring agents are referred to as listeners 190 and the device 192/application 194 sending the feedback information is referred to as a sender. There can be more than one listener listening to a sender's transmission, i.e. there is a many-to-one relationship between the listener and a sender.

Each monitoring agent creates the feedback loop and as already described has the ability to alert the local modelers to monitored responses. These responses are called policy events and are used to detect any aberrant behavior based on operational ranges specified by the policy. Each local modeler has the ability to take immediate corrective action.

Each device within the system 10 accommodates a minimal set of feedback APIs. Devices not able to accommodate this set can be proxied by the device adapter (Listener) into a request/response type of arrangement. This implies there is at least some means of querying the device to be monitored. An example is a device that has a simple web server request/response interface built into it. In rare instances, a device may not be capable of any direct response. If this should arise, the measurement may be inferred through indirect measurement by a proxied device. Sample feedback API calls include createListener(); createSender(); startListener(); startSender(); stopSender(); resetListener(); and resetSender(), all of which are methods in Java according to one embodiment of the invention.

According to one embodiment of the invention, in setting up a feedback loop between the device adapter and a device/application, the monitoring agent assumes the device has an active agent running upon it which sends out information to its associated local modeler. According to another embodiment, it is assumed that the device adapter has the capacity to talk to the device in question using the request/response metaphor to glean information from the monitored device. This information is then intercepted and sent to the relevant local modeler. According to yet another embodiment, where there are no active senders (code) on the device/application side, information is inferred by looking at various devices that have active and request/response capabilities around the device to be monitored. By looking at information going into and out of devices that surround the device that is being monitored, information about said device may be inferred. This embodiment is shown in FIG. 12B of the drawings. Referring to FIG. 12B, the various devices around the device to be monitored function as a proxy sender 196, which provides feedback messages to a listener/monitoring agent 190. The feedback is based on communication between an application 194 and a device 192.

The protocol used to set up the feedback loop is dependent on the device adapter 12.2 and the method used by the device adapter 12.2 to communicate with the device/application. The monitoring agent assumes the communication between the device adapter 12.2 and device is secure and private.

A monitoring agent may monitor more than one feature on a device. The monitoring agent may then aggregate a device's polling attributes together into one larger call.

In some embodiments, the monitoring agents comprise two separate agents sending information between each other; in other embodiments, physically the same monitoring agent acts as both sender and listener.

Messaging Service

The system 10 further includes a messaging service which defines a message bus subsystem which facilitates secure communication between the various system 10 components. The message bus provides for two types of messaging models, publish-and-subscribe and point-to-point queuing, each of which defines a messaging domain.

In the simplest sense, publish-and-subscribe is intended for one-to-many broadcast of messages, while point-to-point is intended for one-to-one delivery of messages.

Publish-and-Subscribe

In a publish-and-subscribe model, one publisher can send a message to many subscribers through a virtual channel called a topic. A topic may be regarded as a mini-message broker that gathers and distributes messages addressed to it. By relying on the topic as an intermediary, message publishers are kept independent of subscribers and vice-versa. A topic automatically adapts as both publishers and subscribers come and go. According to one embodiment of the invention, publishers and subscribers are active when the Java objects that represent them exist. Subscribers are the components which are subscribed to receive messages in a topic. Any messages addressed to a topic are delivered to all the topic's subscribers. The publish-and-subscribe messaging model defines a “push-based” model, where messages are automatically broadcast to subscribers without them having to request or poll the topic for new messages. In the publish-and-subscribe messaging model the publisher sending the message is not dependent on the subscribers receiving the messages. Optionally, the components that use this model can establish durable subscriptions that allow components to disconnect and later reconnect and collect messages that were published while they were disconnected.
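A minimal in-memory Java sketch of the push-based topic behavior is given below for illustration. TopicSketch is a hypothetical class and is not the message bus of the system 10; durable subscriptions, security, and network transport are deliberately left out.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory topic; not the message bus implementation.
final class TopicSketch<M> {
    private final List<Consumer<M>> subscribers = new CopyOnWriteArrayList<>();

    // Subscribers may come and go without the publisher being aware of them.
    void subscribe(Consumer<M> subscriber) {
        subscribers.add(subscriber);
    }

    void unsubscribe(Consumer<M> subscriber) {
        subscribers.remove(subscriber);
    }

    // Push-based delivery: every current subscriber receives the message
    // without polling. Returns the number of subscribers reached.
    int publish(M message) {
        int delivered = 0;
        for (Consumer<M> subscriber : subscribers) {
            subscriber.accept(message);
            delivered++;
        }
        return delivered;
    }
}
```

Because the publisher only holds a reference to the topic, publishing to a topic with no subscribers succeeds and simply reaches nobody, which reflects the independence of publishers and subscribers described above.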

Point-to-Point

In a point-to-point messaging model components are allowed to send and receive messages synchronously as well as asynchronously via virtual channels known as queues. A queue may have multiple receivers. However, only one receiver may consume each message.
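The rule that each queued message is consumed by only one receiver can be illustrated with the following Java sketch, which wraps a standard java.util.concurrent.LinkedBlockingQueue. PointToPointSketch is a hypothetical name, and the non-blocking receive is a simplification chosen for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory queue channel; not the message bus implementation.
final class PointToPointSketch {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    void send(String message) {
        queue.add(message);
    }

    // Any receiver may call this; removing the head of the queue ensures
    // that a given message is consumed by exactly one caller.
    // Returns null when no message is waiting.
    String receive() {
        return queue.poll();
    }
}
```

If two receivers share one PointToPointSketch instance, a message returned to the first receiver is no longer available to the second.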

Anatomy of a Message

Each message has two parts: the message data itself, called the payload or message body, and the message headers and properties.

Message types are defined by the payload they carry. The payload itself may be very structured, as with the StreamMessage and MapMessage objects, or fairly unstructured, as with TextMessages, ObjectMessages, and ByteMessage types (see below). Messages can carry important data or simply serve as notification of events in the system 10.

Message headers provide metadata about messages describing who or what created the message, when it was created, how long the data is valid, etc. The header also contains routing information that describes the destination of the message (topic or queue), and how a message should be acknowledged.

Each message has a set of standard headers. According to one embodiment, a message header may contain the following fields:

Header Field      Set By
Destination       Send Method
DeliveryMode      Send Method
Expiration        Send Method
Priority          Send Method
MessageID         Send Method
Timestamp         Send Method
CorrelationID     Component
ReplyTo           Component
Type              Component
Redelivered       Component

In addition to headers, messages can carry properties that can be defined and set by a message client. Message clients can choose to receive messages based on the value of certain headers and properties, using a special filtering mechanism called a Message Selector (see below).

A message selector allows a component to specify, by message header, the messages it is to receive. Only messages whose headers and properties match the selector are delivered.

Message selectors cannot reference message body values. A message selector matches a message when the selector evaluates to true when the message's header field and property values are substituted for their corresponding identifiers in the selector.
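The selector rule may be sketched in Java as a predicate that is given only the header fields and properties of a message, never its body. BusMessage, SelectorSketch, and priorityAtLeast are hypothetical names, and representing a selector as a java.util.function.Predicate rather than as a selector expression string is a simplification made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical message type: headers and properties share one map here.
final class BusMessage {
    final Map<String, Object> headersAndProperties = new HashMap<>();
    final String body; // never passed to a selector

    BusMessage(String body) {
        this.body = body;
    }

    BusMessage with(String name, Object value) {
        headersAndProperties.put(name, value);
        return this;
    }
}

final class SelectorSketch {
    // Example selector: matches when the Priority header is an integer
    // greater than or equal to the given minimum.
    static Predicate<Map<String, Object>> priorityAtLeast(int minimum) {
        return fields -> fields.get("Priority") instanceof Integer
                && (Integer) fields.get("Priority") >= minimum;
    }

    // The selector is evaluated against header and property values only.
    static boolean matches(Predicate<Map<String, Object>> selector, BusMessage message) {
        return selector.test(message.headersAndProperties);
    }
}
```

A message whose body happens to contain the text "Priority=9" but which carries no Priority header would not match, since the body is never consulted.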

Message Bus Structure

The message bus provides consistent, secured, and stateful/stateless messaging capabilities to the various services and components within the system 10. The message bus provides a mechanism that allows messages to be queued between components (statefulness) and provides a classification mechanism so as to allocate priority levels for specific messages. If a component is restarted, the component can request previous messages, so as to get a history and be in synchronization with the rest of the system 10.

The message bus supports one-way messaging such as multicast and broadcasts at the socket/port level, in addition to the publish-and-subscribe messaging model.

According to one embodiment of the invention, the message bus listens on specific ports for User Datagram Protocol (UDP), Transmission Control Protocol (TCP/IP), and Internet Inter-ORB Protocol (IIOP) traffic.

The message bus implements a client library that other components compile with to communicate with the publish-and-subscribe or one-way messaging models.

Each message within the message bus has a unique ID, derived in part from the Media Access Control (MAC) address of a network card used by a message bus server.

According to one embodiment of the invention, a message on the message bus has its payload encrypted. The header within the Message Bus message then specifies whether a payload is encrypted or not. To further increase security, the message is sent on multiple ports or channels which are randomly changed.

The header within a message on the message bus contains information regarding a message digest (MD5). The digest is used to verify a message has not been modified during transmission.
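The digest check can be illustrated with the standard java.security.MessageDigest class, as in the sketch below. DigestSketch and its method names are hypothetical, and the sketch assumes the digest is computed over the payload bytes alone; the description does not state exactly which parts of the message are covered.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; assumes the digest covers only the payload bytes.
final class DigestSketch {
    // Computes the 16-byte MD5 digest that a sender would place in the header.
    static byte[] md5(byte[] payload) {
        try {
            return MessageDigest.getInstance("MD5").digest(payload);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name on Java platforms.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // The receiver recomputes the digest and compares it with the header value.
    static boolean verify(byte[] payload, byte[] digestFromHeader) {
        return MessageDigest.isEqual(md5(payload), digestFromHeader);
    }
}
```

A receiver would call verify with the received payload and the digest taken from the header; a payload altered in transit yields a different digest and the check fails.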

The message bus saves its current message state to a data store. This data store provides persistence and fault-tolerance to the messaging system.

The message bus groups messages together into specific groups (topics).

According to one embodiment of the invention, there is a single root for the message bus which defines a virtual root for the message bus hierarchy. Each topic then branches off this virtual root.

Message groups (Topics) within the message bus define a hierarchical tree. Topics then may have other topics underneath them as child nodes in the tree.
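The topic hierarchy can be pictured as a simple tree, as in the following Java sketch. TopicNode is a hypothetical class, and the "/"-separated path notation is an assumption made for illustration; the description does not specify how topics are named or addressed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical tree node for the topic hierarchy.
final class TopicNode {
    final String name;
    final TopicNode parent; // null only for the virtual root
    private final Map<String, TopicNode> children = new LinkedHashMap<>();

    private TopicNode(String name, TopicNode parent) {
        this.name = name;
        this.parent = parent;
    }

    // The single virtual root from which every topic branches.
    static TopicNode virtualRoot() {
        return new TopicNode("", null);
    }

    // Returns the named child topic, creating it on first use.
    TopicNode child(String childName) {
        return children.computeIfAbsent(childName, n -> new TopicNode(n, this));
    }

    // Assumed notation: path segments joined by "/", root shown as "".
    String path() {
        return parent == null ? "" : parent.path() + "/" + name;
    }
}
```

With this notation a topic "security" placed under a topic "policy" would have the path "/policy/security".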

Message Types

The number of message types that can be defined for use with system 10 is limitless. However, a few of the message types used in embodiments of system 10 are described below:

Policy Event messages are used to communicate changes of state within the system 10. This type of message is used to transport policies throughout the system 10. The Policy Event messages carry within them all the information needed to apply/run a policy on an agent. Policy Event messages have state and are stored and forwarded on to destinations. Policy Event messages assume the message bus guarantees delivery of the message. Policy Event messages have a unique identifier in the system so that the system can be sure that the Policy Event messages sent have not been tampered with. According to one embodiment of the invention, this identifier could be in the form of a digital signature. Policy Event messages are sent in a secured (SSL) manner on the message bus according to one embodiment of the invention.

Security Policy Event messages are a subset of the Policy Event message type. This message type deals with security related topics such as authorization/authentication issues between various system components.

Logging Policy Event messages define the logging characteristics of the system 10. Logging Policy Event messages can be used for debugging the system 10. The Logging Policy Event messages have various warning and error levels defined.

Auditing Policy Event messages define various audit control messages available to the different system 10 components and device/applications managed by the system 10.

Service Management Event messages are used to control various system 10 services.

Device/Application Management messages are used to control device(s)/application(s) managed by the system 10. The way Management messages are sent is dependent on the device or application with which a device adapter 12.2 is trying to communicate. Management messages may or may not use the message bus to convey control information to the device or application being managed. Management messages assume communication between a device adapter 12.2 and the device/application being managed is guaranteed.

Feedback messages are used to provide real-time/near real-time feedback information on a device/application that is being monitored by the system 10.

Generic messages are used to convey information outside the normal Message Bus boundaries. These messages are used to communicate with outside systems that require special means (gateways) to communicate between system 10 and foreign systems. Generic messages have a guaranteed delivery but have no time dependency attached to them.

Message Bus Behavior

A client object using the message bus client library is able to connect to a publish/subscribe channel. The client library provides event handles or thread safe callbacks to the calling program.

The message bus server keeps a log of all transactions/operations that occur regarding message routing within the message bus.

The message bus provides a mechanism that allows a system administrator to direct and define the logging/auditing levels to various components within the system 10 in real-time.

The message bus server derives a time stamp from a time service. All messages created within the system 10 derive their time stamp either directly or indirectly from a time server running the time service.

The message bus includes an error reporting mechanism which reports specific error conditions relating to commands and operations carried out by both the client/server side of any message bus operation. Notification of an error condition is provided to the clients and servers.

Errors are logged to a Logging Service, which allows the system administrator to perform a system message audit to determine where the error originally occurred.

Client Plug-in

The system 10 includes a Client Plug-in which resides as a Dynamic Link Library (DLL) or in component(s) which install locally on a client's browser.

The Client Plug-in is able to retrieve specific information from a digital certificate that uniquely identifies (authenticates) a user using the browser to connect to a system 10 network.

The Client Plug-in is activated when a web server sends a Hypertext Markup Language (HTML) page back to the browser with a special Multipurpose Internet Mail Extensions (MIME) type embedded in the HTML page calling on the plug-in to start on the client's browser.

Users are able to administer a site list using commands such as add, delete, and modify. This allows the user to either add new sites or change Uniform Resource Locator (URL) addresses when necessary.

In one embodiment of the invention, the Client Plug-in will be activated only if there is a secure SSL connection between the user's browser and a host server identified within the site list.

The state information for mapping a certificate to a site (URL) is kept in a secure (encrypted) file locally or is encrypted and stored within the client operating system registry for the browser's plug-in.

Client Plug-in Behavior

If a user is not connected to a system 10 controlled network, the Client Plug-in will not be activated.

When visiting specific sites within the plug-in's site map list, the plug-in attaches a serial number, from a user's digital certificate residing on the browser, to the front of an HTTP stream. This is used as a tag (or cookie) to uniquely identify each user within the system 10 controlled network.
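By way of illustration only, the activation conditions and tagging behavior described above may be sketched as follows. The function name `tag_request`, the contents of `SITE_LIST`, and the newline-separated tag format are assumptions made for this example; the description does not specify how the serial number is delimited from the rest of the HTTP stream.

```python
from urllib.parse import urlparse

# Hypothetical site list; in the system described it is administered by the user.
SITE_LIST = {"secure.example.com"}


def tag_request(url, cert_serial, http_body):
    """Return the HTTP stream, prefixed with the certificate serial number
    when the plug-in would be active, or unchanged otherwise."""
    parts = urlparse(url)
    # The plug-in activates only over SSL and only for hosts in the site list.
    if parts.scheme != "https" or parts.hostname not in SITE_LIST:
        return http_body
    # Assumed tag format: serial number placed at the front of the stream.
    return f"{cert_serial}\n{http_body}"
```

A request to a listed host over SSL is tagged, while any other request passes through untouched.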

The client browser plug-in activation/request sequence according to one embodiment of the invention will now be described with reference to FIG. 13 of the drawings. Referring to FIG. 13 of the drawings, at (3) the client browser requests a web page from the server. At (4), if the web page requested is a secure page, the server sends an HTML page with MIME type EMBED to start the Client Plug-in on the client browser. The Client Plug-in then verifies if the connection is over SSL. At (6) the Client Plug-in appends the data read from the certificate (Serial Number) to the HTTP stream and sends the request to the web server (the Netscape NPN_PostURL function or a cookie may be used to send this information). At (1) a certificate mapper component reads the certificate mapped by the user. At (2) the certificate information required for authentication is extracted and stored in a predefined location. At (3) the client browser makes a request to a secure page on the server. At (4) the server returns an HTML page with the plug-in EMBED to start the plug-in on the client browser. At (5) the plug-in gets the certificate information from the client's machine. At (6) the certificate information and/or data (using PostURL to the server) is returned. Finally, at (7) the server validates the certificate and returns the requested page.

Policy Distribution

Policy is distributed in the system 10 along a chain defined by the global, regional and the local modelers. The global modeler breaks up and enforces policies upon subordinate modelers. This is done to add scalability and robustness and to reduce decision-making time.

Local modelers typically exist within an ARE. This is done to reduce the footprint of an ARE when memory is in short supply, such as inside a network appliance.

FIG. 14 shows a block diagram of an ARE 12 comprising a local modeler 12.4 which enforces policy on policy modelers 12.5 which define the lowest level modeler. The policy modelers 12.5 reside inside an agent and enforce simple policies.

The modelers apply a policy calculus to decompose policies into simple policies.

Hierarchy Between Modelers

FIG. 15 shows the hierarchy of the various modelers within the system 10 according to one embodiment of the invention. Referring to FIG. 15, the first level is the local level which is defined by local modelers 200, such as local modeler 12.4 in FIG. 14. The local modeler 200 comprises a number of AREs 202 which control devices 204 or applications 206. Each ARE communicates with a topology service 208. One level above the local level is a regional level which is defined by a regional modeler 210 which controls a number of local modelers 200. The highest level in the system 10 is the global or enterprise level which defines a global modeler 212. The global modeler 212 communicates with a policy service 214, the topology service 208, and a configuration service 216 to coordinate overall policy interaction and control. An administrator's GUI 218 facilitates input to the global modeler 212. The administrator's GUI 218 comprises a policy GUI 220 whereby new policy or policy changes may be input into the system 10. The administrator's GUI 218 further comprises a configuration GUI 222 whereby configuration changes to the system 10 may be input.

FIG. 16 shows the configuration of the various modelers shown in FIG. 15 in greater detail. Referring to FIG. 16, a monitoring agent 224 is started as described earlier. In one embodiment the monitoring agent 224 has a sender component 224.1 which sends information about the operation and status of application 206 and device 204, respectively, to a listener component 224.2. Such information is in the form of feedback messages. In a second embodiment, information about an application 206 or device 204 is obtained indirectly using a proxy sender 220.3/listener 224.1 variation of a monitoring agent as previously described. As is shown in FIG. 16, the listeners 220.1 send policy event messages to a controller 226. Each of the controllers 226 is defined by a policy agent/device adapter combination. The controllers 226 are able to send management messages to the applications 206 and devices 204 in order to exert control over these devices and applications. Each of the controllers 226 communicates with local modelers 200 by sending and receiving policy event messages. Communication between the local modelers 200 and a regional modeler 210 takes place via an exchange of policy event messages. Likewise, policy event messages are exchanged between the regional modeler 210 and a global modeler 212. The global modeler is able to send generic system messages to a system administrator.

Thus, according to the embodiments described above, the local modelers form part of the Policy Decision Point (PDP) for agents and are concerned with monitoring events and notifying registered agents when to apply appropriate behavior-based modifications to policy currently running upon the agents.

FIG. 17 illustrates the decision making process used by the modelers, according to one aspect of the invention. Referring to FIG. 17, at 230 feedback is received from a device or application being monitored. At 232 the feedback is processed into a Policy Event Message (PEM) by a monitoring agent. At 234 the PEM is passed to a policy agent which decides at 236 whether it is endowed with the necessary policy in order to effect the necessary corrective action based on the PEM. If the agent decides that it has the necessary policy, at 238 a determination is made as to whether the policy has been violated. If the policy has been violated, at 240 corrective action is performed; otherwise, at 242 the agent does nothing.

If it is determined that the agent lacks the necessary policy in order to take action, at 244 the PEM is passed to a local modeler. At 246 the local modeler makes a decision as to whether it is able to handle the PEM by itself or whether it needs further policy. If it is decided that the modeler is able to handle the PEM, at 248 a determination is made as to whether the policy has been violated. If it is determined that a policy violation has occurred, at 250 an instruction is sent to the policy agent to perform corrective action. Alternatively, if policy has not been violated, at 252 the local modeler does nothing.

If it is determined that the local modeler lacks the necessary policy to take action, at 254 the PEM is passed to a regional modeler which makes a decision at 256 whether it has the necessary policy in order to take action. If it is decided that the regional modeler has the necessary policy to take action, then at 258 a determination is made as to whether the policy has been violated. If the policy has been violated, at 260 the regional modeler sends instructions which are filtered through the local modelers and are eventually received by an appropriate policy agent, whereupon the policy agent takes the necessary corrective action. If it is decided that no policy has been violated, at 262 the regional modeler does nothing.

If a determination is made that further policy is required by the regional modeler, at 264 the PEM is passed to the global modeler. At 266 the global modeler makes a decision as to whether policy has been violated. If policy has been violated, at 268 the global modeler sends instructions which eventually reach a relevant policy agent by passing firstly through the regional modelers and then secondly through one or more local modelers. The instructions tell an agent to perform the necessary corrective action.
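The escalation of a PEM from a policy agent up through the local, regional and global modelers may be pictured with the following minimal sketch. The class name `Tier`, the dictionary shape of a PEM, the event types and the thresholds are all hypothetical, and the "no policy available" outcome at the top of the chain is an assumption, since FIG. 17 as described does not cover that case.

```python
class Tier:
    """One decision level: a policy agent or a local/regional/global modeler."""

    def __init__(self, name, policies, parent=None):
        self.name = name
        self.policies = policies  # maps event type -> violation test (assumed shape)
        self.parent = parent

    def handle(self, pem):
        """Return (tier name, decision) for a Policy Event Message."""
        test = self.policies.get(pem["type"])
        if test is None:
            if self.parent is None:
                return self.name, "no policy available"
            # This tier lacks the necessary policy: pass the PEM upward.
            return self.parent.handle(pem)
        if test(pem):
            return self.name, "corrective action"
        return self.name, "do nothing"


# Hypothetical chain: agent -> local -> regional -> global.
global_m = Tier("global", {"link_down": lambda pem: True})
regional = Tier("regional", {"latency": lambda pem: pem["value"] > 200}, global_m)
local = Tier("local", {"cpu": lambda pem: pem["value"] > 90}, regional)
agent = Tier("agent", {"disk": lambda pem: pem["value"] > 95}, local)
```

Calling `agent.handle({"type": "latency", "value": 250})` is resolved at the regional tier, because neither the agent nor the local modeler holds a policy for that event type. The downward routing of instructions back to the policy agent is not modeled here.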

The local modelers are responsible for picking up change control messages from either a parent modeler or the topology service. The local modelers monitor or listen on a range of well-known ports defined by a system administrator or a client. If the port assignments are defined by a client, the port assignments need to be made at the global level for all the lower level modelers underneath to work correctly.

Characteristics of Local Modeler Behavior

As described, the primary function of the local modelers is to deploy, coordinate, and control compound policy statements over multiple managed devices within the system 10. Compound policy statements are policies that encompass more than one device/resource distributed within the system 10. Each local modeler coordinates and develops strategies to cope with the various demands the network/application may make.

Each local modeler controls state replication of agents, which involves replicating the state of each agent or of other modelers. This is done in order to make the system 10 fault-tolerant. Each local modeler goes through a selection process to determine the nearest neighbor to which it can safely replicate, by doing a ping and checking the round trip values to and from the various local modelers within its functional domain. The modelers with the lowest round trip values are selected for agent replication.
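The neighbor selection step may be illustrated as follows. This is a sketch under stated assumptions: the round trip values are taken as already measured, the function name `select_replication_peers` and the modeler names are hypothetical, and an unanswered ping is represented by `None`.

```python
def select_replication_peers(round_trip_ms, count=1):
    """Pick the modelers with the lowest measured round-trip times.

    round_trip_ms maps a modeler name to its ping round trip in milliseconds,
    or None when the ping went unanswered.
    """
    reachable = {name: rtt for name, rtt in round_trip_ms.items() if rtt is not None}
    ranked = sorted(reachable, key=reachable.get)
    return ranked[:count]


# Hypothetical measurements within one functional domain.
measurements = {"modeler-a": 12.5, "modeler-b": 3.1, "modeler-c": None, "modeler-d": 7.8}
peers = select_replication_peers(measurements, count=2)
```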

Each local modeler controls only agents that are registered within its operational ARE or that are being proxied by the local ARE in which the local modeler resides.

Each local modeler coordinates policy between the various policy agents deployed upon an ARE. This includes coordinating policy resource sharing between devices and policy enforcement upon managed devices.

Each local modeler listens for feedback messages sent by devices it has been set up to monitor. It can listen to devices and agents on other AREs.

Each local modeler also listens for change control (notification) messages sent out by the topology service. These messages are sent by the topology service to notify AREs, and specifically agents, when a change in their policy or configuration has occurred. The local modeler listens to these messages through a well-known socket port.

If a policy is a complex policy, the local modeler breaks the policy into a set of simpler policies using a policy calculus and applies them to policy agents or subordinate modelers.

Characteristics of Global Modeler Behavior

The global modeler forms a master scheduler/planner/architect for one or more functional domains. It is responsible for analyzing, in real-time, a single functional domain and determining its optimal usage between various components given various global/regional/local policies applied to the system 10. It does this in real-time and manages by sending out commands to the local modelers.

The global modeler has a primary and secondary systems image of the system it is modeling and controlling. The primary systems image represents the current state of a particular functional domain that the global modeler is controlling. The secondary systems image is used for backup purposes and provides a means to test policies on a backup before implementation.

Policy Structure

Policies contain within themselves directives to set up, enforce, and coordinate actions within the network/application they are controlling. Policies know about the participants required to set up a feedback loop and to control and monitor the loop. In effect, policy can be thought of as a coherent set of rules to administer, manage, and control access to network/application resources.

Policies are applied to the topology service. According to one embodiment of the invention, the policy service uses the topology service as a mechanism to link policy and configuration information together for distribution within the system 10.

There are two types of policies within the system 10, viz. simple and compound policies. A simple policy has the form: IF Test=True THEN DO specific ACTION. A simple policy is evaluated in this way: “IF” the condition evaluates to true by testing a variable, “THEN” a specific action (behavior) is applied to a device or application.
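A simple policy of this form may be sketched as a condition paired with an action. The class name `SimplePolicy`, the dictionary used to represent device state, and the throttling example are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimplePolicy:
    condition: Callable[[dict], bool]   # the IF test on a monitored variable
    action: Callable[[dict], None]      # the behavior applied when the test is true

    def evaluate(self, state):
        """Apply the action and return True only when the condition holds."""
        if self.condition(state):
            self.action(state)
            return True
        return False


# Hypothetical example: throttle a device when its utilization exceeds 80%.
throttle = SimplePolicy(
    condition=lambda s: s["utilization"] > 0.8,
    action=lambda s: s.update(throttled=True),
)
device = {"utilization": 0.93}
fired = throttle.evaluate(device)
```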

A compound policy consists of two or more simple or compound policies grouped together in a statement. A compound policy is evaluated by first evaluating the highest level of containment (the parent compound policy). If the outermost condition(s) of the parent compound policy evaluate to true, then the subsequent child policies are evaluated by order of their placement in the compound policy statement array, and by their priority (see below). If the outermost containment compound policy conditions evaluate to false, then no other child evaluations are performed, and the compound policy is evaluated to false.
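The containment-first evaluation of a compound policy may be sketched as below. How placement order and priority combine is not spelled out in the description; this sketch assumes that a lower priority number is evaluated earlier and that placement order breaks ties. The class name `Policy` and the `trace` list are hypothetical.

```python
class Policy:
    """A policy with a condition, an optional action, and child policies."""

    def __init__(self, name, condition, action=None, priority=0, children=()):
        self.name = name
        self.condition = condition
        self.action = action
        self.priority = priority        # assumed: lower number = evaluated earlier
        self.children = list(children)

    def evaluate(self, state, trace):
        if not self.condition(state):
            return False                # children are skipped entirely
        trace.append(self.name)
        if self.action:
            self.action(state)
        # sorted() is stable, so equal priorities keep their placement order.
        for child in sorted(self.children, key=lambda p: p.priority):
            child.evaluate(state, trace)
        return True


always = lambda s: True
parent = Policy("parent", lambda s: s["load"] > 0.5, children=[
    Policy("b", always, priority=2),
    Policy("a", always, priority=1),
    Policy("c", always, priority=2),
])
```

A policy with no children behaves as a simple policy, so the same class serves both cases in this sketch.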

Policies are prioritized by level of importance. Depending on the priority level associated with a given policy, the policy can have a specific evaluation order.

For example, a policy may be labeled with one of three designations: (1) mandatory, (2) recommended, and (3) don't care. Mandatory policy must always be attempted, recommended policy is attempted if at all possible but is not necessary, and don't care is self-explanatory.

Policies are linked with configuration information to form a complete policy. A policy that is linked to a specific configuration uses a unique hash key to map from one to the other. The key is used as a reference to link both pieces of information together.
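One way to picture the linkage is two stores indexed by the same key. The description does not say how the hash key is derived; the SHA-256 digest of a policy ID and device name used here is purely an assumption, as are the function and variable names.

```python
import hashlib


def link_key(policy_id, device_id):
    """Derive a key for a policy/configuration pair (assumed derivation)."""
    return hashlib.sha256(f"{policy_id}:{device_id}".encode()).hexdigest()


policies = {}
configurations = {}


def register(policy_id, device_id, policy, configuration):
    """Store both halves under one shared key and return that key."""
    key = link_key(policy_id, device_id)
    policies[key] = policy
    configurations[key] = configuration
    return key


def complete_policy(key):
    """Join both halves through the shared key."""
    return {"policy": policies[key], "configuration": configurations[key]}


key = register("P12345678", "router-1", "limit bandwidth", {"max_kbps": 512})
```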

In one embodiment of the invention, policies are chained together to form a policy behavior chain. Multiple policies can be linked together to form behavior chains. A chaining field is defined within a policy, which holds comma-delimited policy IDs used to specify which policies should be called next and in what order they are to be executed, for example, PolicyChainField: P12345678, P23456789, P34567890, etc.
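Reading such a chaining field and executing the chain in order may be sketched as follows. The `registry` mapping of policy IDs to callables is a hypothetical stand-in for however policies are actually looked up.

```python
def parse_chain_field(field):
    """Split a comma-delimited chaining field into an ordered list of policy IDs."""
    return [pid.strip() for pid in field.split(",") if pid.strip()]


def run_chain(field, registry):
    """Call each chained policy in the order listed; unknown IDs raise KeyError."""
    return [registry[pid]() for pid in parse_chain_field(field)]


# Hypothetical registry keyed by the policy IDs from the example above.
registry = {
    "P12345678": lambda: "first",
    "P23456789": lambda: "second",
    "P34567890": lambda: "third",
}
results = run_chain("P12345678, P23456789, P34567890", registry)
```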

Policies vary in the level of abstraction from business-level expression of QoS of service level agreements (SLA), to the specification of a set of rules that apply to a specific device/application on the network. Higher-level rules are defined for global and regional modelers. These are called domain level policies, and may have late binding variables that are unspecified, or specified by a classification, whereas the device-level rules used at the local modeler level have no late bindings.

Policy abstraction levels can be represented as services. Services are administrative divisions of functionality.

As described, the global modeler includes a modeler 18.4 which in one embodiment of the invention defines a policy refinery. The operation of the policy refinery will now be explained with reference to FIG. 18. Referring to FIG. 18, at 280 the refinery monitors/listens to events within system 10. At 282 the refinery detects abnormalities in the system. At 284 the refinery creates one or more policies to try and fix the abnormality. It bases these policies upon its experience with similar problems. At 286 the refinery applies these corrective policies and at 290 the refinery observes the results of the corrective policies. The refinery learns by observing the new behavior of the system 10 and specifically the effect of the corrective policies on the system. In one embodiment of the invention, the refinery uses numerical methods to determine appropriate modifications to policy. The numerical methods may include the use of Kohonen Self Organizing maps and/or a Dijkstra Self Stabilization Algorithm. A Kohonen Self Organizing map is a neural network algorithm based on unsupervised learning. In another embodiment, the refinery uses predictive algorithms to predict the failure of a network device and to determine appropriate corrective policy ahead of the failure. At 292 a determination is made as to whether the problem or abnormality has been remedied. If it is determined that the problem still persists, at 288 the refinery creates further corrective policy or adjusts previously applied corrective policy and step 286 is performed again. If the problem has been fixed, then at 294 the refinery updates its knowledge base with information regarding how to fix a similar problem in future.
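The control flow of FIG. 18 may be sketched as a detect, apply, observe and adjust loop. This sketch does not model the numerical or predictive methods mentioned above; the `max_rounds` bound, the callback names and the queue-draining example are assumptions introduced only to make the loop concrete and terminating.

```python
def refine(system, detect, propose, adjust, knowledge_base, max_rounds=5):
    """Apply corrective policy until the abnormality clears or rounds run out."""
    problem = detect(system)                  # steps 280/282: listen and detect
    if problem is None:
        return None
    # Step 284: reuse a remembered fix if one exists, else create a policy.
    policy = knowledge_base.get(problem) or propose(problem)
    for _ in range(max_rounds):
        policy(system)                        # step 286: apply corrective policy
        if detect(system) is None:            # steps 290/292: observe and check
            knowledge_base[problem] = policy  # step 294: remember the fix
            return policy
        policy = adjust(policy)               # step 288: adjust and retry
    return None


def make_drain(step):
    """Hypothetical corrective policy: shrink a queue by a fixed step."""
    def drain(system):
        system["queue"] = max(0, system["queue"] - step)
    drain.step = step
    return drain


detect = lambda s: "overload" if s["queue"] > 20 else None
kb = {}
system = {"queue": 100}
fix = refine(system, detect, lambda p: make_drain(30),
             lambda pol: make_drain(pol.step * 2), kb)
```

In this run the first policy leaves the queue overloaded, so the adjusted policy is applied on the second pass and is the one stored in the knowledge base.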

Services

Services are abstractions of policy or administrative divisions of functionality. Viewed differently, a service is the work performed or offered by a server. As will be appreciated, system 10 may have various services according to different embodiments of the invention. The message service has already been described. Each of the services has one or more associated servers running the service. What follows is a description of services used in one embodiment of the system 10.

Policy Service

The policy service sends policy information to the topology service. The policy service links policy and configuration information together by a hash key generated by the policy service.

An individually managed device can have many policies applied to it at the same time. This refers to the fact that each policy can configure a portion of a device's overall functional capability. Policy can overlap with other policies and the policy service determines when a policy or combination of policies is invalid.

A new policy is considered invalid if it affects by direct or indirect action other policies already in existence. This can be caused by, for example, two or more policies sharing the same resource; or by implementing a policy that causes an indirect impact by diverting system resources away from existing enforced policies, resulting in overcompensation and serious side effects to the system as a whole.
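The direct case, in which two policies share the same resource, may be checked with a simple set intersection. This sketch covers only that case and treats every shared resource as a conflict, which is a simplifying assumption; indirect impacts are not modeled. The function name and the resource names are hypothetical.

```python
def conflicting_policies(new_resources, active):
    """Return names of active policies that share a resource with the new one.

    active maps a policy name to the set of resources it uses.
    """
    new_resources = set(new_resources)
    return sorted(name for name, used in active.items() if new_resources & set(used))


# Hypothetical set of policies already enforced.
active = {
    "qos-voice": {"eth0", "queue-1"},
    "backup-window": {"disk-array"},
}
clashes = conflicting_policies({"eth0", "cpu"}, active)
```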

Policies follow a hierarchical distribution chain or tree. This means that AREs above a given node within the tree hierarchy can hold policy information which a child node can request.

Policy Deployment Mechanics

Policy Service-to-Topology service: In the policy deployment sequence, policies, whether new or existing, from the policy service are first sent to the topology service. The topology service holds a tree that defines a logical management hierarchy for the system 10.

Topology Service-to-ARE: Within the topology service, the policies that are active currently reflect the state of the network as controlled and managed by the system 10 for that particular functional domain. Each ARE, after startup/initialization, contacts the topology service to determine which local modeler is assigned to its functional domain. Thereafter the ARE obtains the relevant policies/configurations from the topology service and instantiates appropriate agents to handle the policy or configuration demands. If the ARE is unable to start an agent, it sends a response back to the topology service indicating there was a failure to start the agent on the ARE.

ARE-to-local modeler: When a policy (agent) is instantiated upon an ARE, the ARE registers the policy with the local modeler assigned to its functional domain.

ARE-to-Agent: After connecting to the topology service and determining what policies (agents) are to be started, the ARE goes about instantiating an agent within the ARE as defined by the topology service management tree. Part of instantiating an agent includes notifying the appropriate modeler that a new policy has been applied within the ARE.

Agent (Policy)-to-Device (Direct): The agent (policy) running on the ARE first attempts to issue commands through its device adapter. The first command issued is a status command that shows that the communication channel between the device and the agent's device adapter is working and the device is in an operational state.

Agent-to-Application (Direct): The agent (policy) running on the ARE first attempts to issue a request to the application through its device adapter. The initial request consists of an information status command used to determine if the virtual communications channel is open between the agent (policy) and the application/service through the agent's device adapter.

Agent-to-ARE-to-ARE-to-Agent-to-Device (Proxied): This defines the proxy process that occurs when one agent cannot fulfill a policy agent request on a particular ARE and must seek another ARE to host the request. If the ARE is unable to start the agent, it sends a request back to the topology service requesting another ARE close by to proxy the agent. The topology service sends a request to a neighboring ARE to determine if it can host the agent. If the neighboring ARE can, it sends back an acknowledgement to the topology service and the request is then handled on the proxied ARE. Messages sent to the original agent are now all directed to the proxied ARE/Agent. The policy is enforced by the agent and communicated to the actual device through the device adapter interface.

Agent-to-ARE-to-ARE-to-Agent-to-Application (Proxied): This defines the proxy process that occurs when one agent cannot fulfill a policy agent request on a particular ARE and must seek another ARE to host the request. If the ARE is unable to start the agent, it sends a request back to the topology service requesting another ARE close by to proxy the agent. The topology service sends out a request to a neighboring ARE to determine if it can host the agent. If the proxy ARE can, it sends back an acknowledgment to the topology service and the request is then handled on the proxied ARE. Messages sent to the original agent are now all directed to the proxied ARE/Agent. The policy is enforced by the agent and communicated to the actual application through its device adapter.
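The proxy fallback common to both proxied cases may be sketched as follows. The role of the topology service is folded into the hypothetical function `place_agent`, and the `capacity` attribute is an assumed stand-in for whatever prevents an ARE from starting an agent.

```python
class ARE:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity          # assumed: number of agents it can host
        self.agents = []

    def try_start(self, agent):
        """Start the agent if there is room; report success or failure."""
        if len(self.agents) >= self.capacity:
            return False
        self.agents.append(agent)
        return True


def place_agent(agent, home, neighbors, routes):
    """Start agent on its home ARE, or proxy it to the first willing neighbor.

    routes records where messages for the agent should now be directed.
    """
    for are in [home, *neighbors]:
        if are.try_start(agent):
            routes[agent] = are.name
            return are
    return None


# Hypothetical domain: the home ARE and first neighbor cannot host the agent.
home = ARE("are-1", 0)
n1, n2 = ARE("are-2", 0), ARE("are-3", 1)
routes = {}
host = place_agent("policy-agent", home, [n1, n2], routes)
```

After the call, `routes` points the agent's messages at the proxied ARE rather than at the original host.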

Local modeler-to-Topology service: The local modeler notifies the topology service when an agent running on an ARE is no longer within SLA compliance of an active policy within the system 10.

Topology Service-to-Policy Service: The topology service communicates with the policy service when a policy being administered by an agent is no longer able to stay within an SLA range. The local modeler signals to the topology service that the agent is no longer able to adequately service the policy (SLA) and that a new policy is needed. The topology service signals the policy service that the policy is no longer adequate and that a new policy is needed. At this point the global modeler would then take over and determine the best course of action needed, or the policy service would check to see if any policies are chained to the existing failed policy.

Policy rules have an implicit context in which they are executed. For example, the context of a policy rule could be all packets running on an interface or set of interfaces on which the rule is applied. Similarly, a parent rule provides a context to all of its sub-rules. The context of the sub-rules is the restriction of the context of the parent rule to the set of cases that match the parent rule's condition clause.

The relationship between rules and sub-rules is defined as follows: The parent rule's condition clause is a pre-condition for evaluation of all nested rules. If the parent rule's condition clause evaluates to “false”, all sub-rules are skipped and their condition clauses are not evaluated. If the parent rule's condition evaluates to “true”, the set of sub-rules is executed according to priority. If the parent rule's condition evaluates to “true”, the parent rule's set of actions is executed before execution of the sub-rules' actions. A default action is one that is to be executed only if none of the more specific sub-rules are executed. A default action is defined as an action that is part of a catchall sub-rule associated with the parent rule. The association linking the default action(s) in this special sub-rule has the lowest priority relative to all other sub-rule associations.
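The relationship between a parent rule, its sub-rules and a default action may be sketched as below. The class name `Rule`, the convention that a higher priority number runs first, and the packet-marking example are assumptions made for illustration.

```python
class Rule:
    def __init__(self, name, condition, priority=0, sub_rules=(), default=None):
        self.name = name
        self.condition = condition
        self.priority = priority          # assumed: higher number runs first
        self.sub_rules = list(sub_rules)
        self.default = default            # name of the catch-all action, if any

    def execute(self, ctx, log):
        """Append executed action names to log; return True if this rule fired."""
        if not self.condition(ctx):
            return False                  # sub-rule conditions are never evaluated
        log.append(self.name)             # parent actions run before sub-rule actions
        ordered = sorted(self.sub_rules, key=lambda r: -r.priority)
        fired = [r.execute(ctx, log) for r in ordered]
        if self.default and not any(fired):
            log.append(self.default)      # lowest priority: runs only as a fallback
        return True


# Hypothetical rule whose context is all packets on interface eth0.
rule = Rule("on-interface", lambda p: p["iface"] == "eth0", sub_rules=[
    Rule("mark-voice", lambda p: p["port"] == 5060, priority=1),
    Rule("mark-web", lambda p: p["port"] == 80, priority=2),
], default="best-effort")
```

A packet on eth0 that matches neither sub-rule falls through to the default action, while a packet on another interface triggers nothing at all.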

Topology Service

The topology service ties together both the configuration and policy to a specific device. The topology service acts as a central point within the system that defines the overall system control state and what policies and configurations are defined for each controllable device being managed by the system 10.

The topology service notifies AREs of changes in its structure by sending out a broadcast or multicast message embedded with information about configuration changes. The ARE then alerts an agent that the topology state has changed. The agent then updates itself with new policy to bring itself back into compliance with the topology service's system state.

The topology service acts in a passive manner. It does not actively seek out ARE/Agents that are out of synchronization with its network system state. It follows a request/response stateless paradigm similar to that used within the HTTP protocol.

Configuration Service

The configuration service stores configurations for each policy used on a managed device within the system 10. The configuration service stores information required to carry out a policy on a particular device. This information is device specific, making each configuration unique to each device or class of devices.

An individually managed device can have many configurations applied to it at the same time. This is because each policy can configure a portion of a device's overall functional capability. The specific libraries and classes needed for a particular policy-configuration are stored within the configuration service server. Alternatively, the links to the classes and libraries needed are moved. The configuration service stores hardware/software specific commands that can be used to interact with the device/application that is to be controlled by the system 10. The configuration service will initialize and make known on a well-known port the configuration for AREs running within the system 10.

Security Service

The system includes a security service to implement security measures. The security service includes an authorization module which defines a checkpoint for incoming requests from sources outside the system 10 and which is used to monitor/defend against unwanted and unauthorized intrusions into the system 10. The authorization module resides on an edge or along the outer perimeter of a network managed by system 10. The authorization module assumes that the transmission of data between the sender (user's browser) and the receiver (authorization module) uses some form of encryption. As described above, in one embodiment the system 10 uses SSL encryption between the sender and receiver.

The authorization module validates user, system and service level requests. The validation and checking of requested actions is performed using an access control list (ACL) along the edge boundary point. The ACL is used to manage and control the permissions of session related requests within the system 10. Only authorized users within the system 10 have authority to grant ACL rights to system resources.

An application request is first identified at the edge of the system 10 before it is allowed to proceed to its destination inside the system 10. The incoming request is scanned for an identifying digital certificate, which is appended to or embedded in the incoming request by the sender. The incoming request may, for example, be an HTTP request. The sender is required to attach a digital certificate to the front of each incoming request into the system 10 using an SSL connection. The authorization module parses and strips the certificate off each incoming request and compares it to an in-memory cache within the authorization module. If there is a match within the cache, the request is validated and passed on to its destination and the session window timer is reset. If there is no match within the cache, a query is then made from the authorization module to a data store to determine if the certificate is valid. If the certificate is found in the data store, a session object is created along with the ACL. A session window timeout timer is then started and the request is allowed into system 10. If the certificate cannot be validated or found within the data store, or no certificate is attached to the request, the request is considered invalid and ignored. If a significant number of requests that are deemed invalid are from the same IP address, an administrator can set a filter so that IP addresses from bad requesters are added to a bad IP address list and so that these IP addresses can be ignored by the system 10 in future.

Users are identified from outside the system 10 using a certificate generated at the site using the system 10. Usually the site is that of a company which then assumes responsibility for the certificate with respect to outside customers. The certificate authentication server is usually a third party application which is used to authenticate the certificate used by the incoming request.
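The cache-then-data-store check performed by the authorization module may be sketched as follows. The class and method names, the dictionary standing in for the data store, the 300-second default window, and the treatment of an expired cache entry as a cache miss are all assumptions; certificate parsing, third-party certificate authentication and the bad IP address filter are not modeled.

```python
import time


class AuthorizationModule:
    def __init__(self, data_store, session_window=300.0, clock=time.monotonic):
        self.data_store = data_store      # maps certificate -> ACL (assumed shape)
        self.session_window = session_window
        self.clock = clock
        self.cache = {}                   # certificate -> session dict

    def admit(self, certificate):
        """Return the session for a valid request, or None if it is ignored."""
        now = self.clock()
        session = self.cache.get(certificate)
        if session is not None and now <= session["expires"]:
            # Cache hit: validate the request and reset the session window timer.
            session["expires"] = now + self.session_window
            return session
        # Cache miss (or, by assumption, an expired entry): consult the data store.
        acl = self.data_store.get(certificate) if certificate else None
        if acl is None:
            self.cache.pop(certificate, None)
            return None                   # unknown or missing certificate
        session = {"acl": acl, "expires": now + self.session_window}
        self.cache[certificate] = session
        return session
```

The `clock` parameter exists only so that the timer behavior can be exercised deterministically; a caller would normally leave it at its default.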

Session Service

The system 10 defines a session as a request window from a source outside the system 10 that has a specific time window in which actions can be performed and service level requests handled. A session holds information regarding the requesting client's permissions within the system 10. This information consists of Access Control Lists (ACL) relating to devices (servers) and applications the requesting client has permission to use. A session also holds references to workflows used within a session time window. These workflows are temporary holding storage places for intermediate data generated by a transaction between a client session and a backend service within system 10. A session service starts a session when a user logs into system 10 and issues a command. The expiration time defines a time window within which user actions are accepted. Actions outside this window are considered invalid. A valid time window is defined for a specific interval of time, which a system administrator defines. The time window interval is reset and a new expiration countdown clock is started for the user session window when the user sends another request within the existing user session time interval. A session is non-localized in the sense that any point within the system 10 can be used as an entry point for a session. The session information is distributed between the topology services so as to provide a global access point for session requests.

Workflow Service

The system 10 uses a workflow service which is a transaction-processing (TP) Monitor that provides a temporary storage facility for intermediate transaction data. The workflow service is used to provide optimal bandwidth and resource usage between different TP components within the system 10. It acts as a fault-tolerant temporary storage area for data being processed over the web. The workflow service defines a workflow object comprising a data cache that holds intermediate data being processed for a transaction monitored by the system 10.

Time Service

The system 10 uses a time service to time stamp and synchronize messages and events within the system 10. The time service provides a basic utility library of APIs to help in coordination of system events. Support for single, periodic, and custom time events is provided. The time service is responsible for synchronizing the various components using point-to-point messages and broadcast events within the system.

Specific Implementation

FIG. 19 shows an implementation 300 of a system 10 in accordance with the invention, using CORBA. The implementation 300 includes a managed device 302 which is managed or controlled by an ARE 304. The ARE 304 includes a security manager 304.1 and a class loader 304.2. A monitoring agent 306 monitors for feedback from device 302. A policy agent 308 issues commands and communicates with the device 302 via a device adapter 310. The ARE 304 obtains configuration information at runtime from a configuration service 312 during an auto discovery phase in which the ARE 304 is updated with system configuration information. A CORBA session is established between the ARE 304 and a CORBA naming service 314.

In particular, the system 10 defines a session as a request window from a source outside the system 10 control domain that has a specific time window in which actions can be performed and service level requests handled.

The session holds information regarding the requesting client's permissions within the system 10. This information comprises Access Control Lists (ACLs) relating to devices (servers) and applications the requesting client has permission to use.

A session object is created to hold references to objects that are used within a session time window. These objects are temporary holding storage places for intermediate data generated by a transaction between the client session and some backend service within domains. An expiration timer is started when an agent activates and issues a command. Agent interactions are only accepted within this valid time window. A valid time window is defined for a specific interval of time, which a system administrator defines. The time window interval is reset and a new expiration countdown clock is started for the user session window when the user sends another request within the existing user session time interval. A session is non-localized in the sense that any agent within the domain can be approached as an entry point for a session. The session information will be distributed between all the running topology services so as to provide a global access point for session requests.

The naming service 314 saves information to a persistent data store 316 which is backed up into a back-up persistent data store 318. The implementation 300 includes a policy configuration data storage 320 which is accessed by a policy service 322 and a policy configuration service 324. Each of the policy service 322 and policy configuration service 324 communicates and exchanges information with a topology service 326. The topology service 326 is able to send socket-based broadcasts/alerts to the ARE 304 to alert the ARE 304 of changes in network policy. Topology information stored in the topology service 326 is replicated in a data store 328.

FIG. 20 shows another implementation 350 of a system 10 in accordance with the invention. The implementation 350 is similar to the implementation 300 and accordingly the same reference numerals have been used to describe the same or similar features. The main difference between implementation 350 and implementation 300 is that implementation 350 utilizes an LDAP server 352 which is accessed by class loader 304.2. The LDAP server 352 is replicated in LDAP server 354 and in one or more federated LDAP servers 358.

In implementations 300 and 350, other services necessary for the operation of system 10, which services have already been described, have not been included for the sake of clarity. However, a reader skilled in the art will appreciate how these services relate to implementations 300 and 350.

In the above description reference was made to servers at various places. For example, each of the services was described as running on one or more servers. FIG. 21 shows various components making up a server 300 in accordance with one embodiment of the invention. Referring to FIG. 21 it will be seen that system hardware 300 includes a memory 304, which may represent one or more physical memory devices, which may include any type of random access memory (RAM), read only memory (ROM) (which may be programmable), flash memory, non-volatile mass storage device, or a combination of such memory devices. The memory 304 is connected via a system bus 310 to a processor 302. The memory 304 includes instructions 306 which when executed by the processor 302 cause the processor to perform the methodology of the invention or run one or more services as discussed above. Additionally, the system 300 includes a disk drive 306 and a CD ROM drive 308 each of which is coupled to a peripheral-device and user-interface 312 via bus 310. Processor 302, memory 304, disk drive 306 and CD ROM 308 are generally known in the art. Peripheral-device and user-interface 312 provides an interface between system bus 310 and various components connected to a peripheral bus 316 as well as to user interface components, such as display, mouse and other user interface devices. A network interface 314 is coupled to peripheral bus 316 and provides network connectivity to system 300.

For the purposes of this specification, a machine-readable medium includes any mechanism that provides (i.e. stores and/or transmits) information in a form readable by a machine (e.g. a computer). For example, a machine-readable medium includes read-only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g. carrier waves, infrared signals, digital signals, etc.); etc.

It will be apparent from this description that aspects of the present invention may be embodied, at least partly, in software. In other embodiments, hardware circuitry may be used in combination with software instructions to implement the present invention. Thus, the techniques are not limited to any specific combination of hardware circuitry and software.

Although the present invention has been described with reference to specific exemplary embodiments, it will be evident that various modifications and changes can be made to these embodiments without departing from the broader spirit of the invention as set forth in the claims. Accordingly, the specification and drawings are to be regarded in an illustrative sense rather than in a restrictive sense.

Further, particular methods of the invention have been described in terms of computer software with reference to a series of flowcharts. The methods to be performed by a computer constitute computer programs made up of computer-executable instructions illustrated as blocks (acts). Describing the methods by reference to a flowchart enables one skilled in the art to develop such programs including such instructions to carry out the methods on suitably configured computers (the processing unit of the computer executing the instructions from computer-readable media). The computer-executable instructions may be written in a computer programming language or may be embodied in firmware logic. If written in a programming language conforming to a recognized standard, such instructions can be executed on a variety of hardware platforms and for interface to a variety of operating systems. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein. Furthermore, it is common in the art to speak of software, in one form or another (e.g. program, procedure, process, application, module, logic . . . ), as taking an action or causing a result. Such expressions are merely a shorthand way of saying that execution of the software by a computer causes the processor of the computer to perform an action or produce a result. It will be appreciated that more or fewer processes may be incorporated into the methods as described above without departing from the scope of the invention, and that no particular order is implied by the arrangement of blocks shown and described herein.

EXAMPLES

Specific examples of system 10 components described above are now provided.

(a) In a Java embodiment, the messaging adapter 112 may be described by the following class definitions:

    class SendQueue super actor /* sending messages is an action */
      methods
        Send +<Rcp> [+<Ref>] +<Msg-type> +<Content> {...}

    class ReceiveQueue super sensor /* receiving messages is a sensing activity */
      methods
        rec_s ?<Sdr> ?<Msg-type> ?<Content> {...}
        rec_a ?<Sdr> ?<Msg-type> ?<Content> [+<Timeout>] {...}

These definitions provide two functions which allow an agent 110 to receive messages from other agents, viz. rec_s and rec_a. rec_s is a function that waits synchronously until a message has been received. The arguments denote (from left to right) the sender of the message, the message type, and the actual message content. If the arguments are provided with values, only messages matching the parameter descriptions are returned. rec_a looks for messages asynchronously and fails if no matching messages have been received. rec_a has an optional time-out parameter that specifies a time interval during which the message queue is monitored for matching messages. The default value for this time interval may be set to zero.
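The matching behaviour of rec_s and rec_a described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `deliver` helper, the use of `None` as a wildcard for an unbound argument, and returning `None` to signal failure are all choices made for this sketch and are not taken from the specification.

```python
import threading
import time


class ReceiveQueue:
    """Sketch of a receive queue with pattern-matched retrieval."""

    def __init__(self):
        self._messages = []                  # (sender, msg_type, content) tuples
        self._cond = threading.Condition()

    def deliver(self, sender, msg_type, content):
        # Hypothetical helper: called by the transport when a message arrives.
        with self._cond:
            self._messages.append((sender, msg_type, content))
            self._cond.notify_all()

    def _take(self, pattern):
        # Remove and return the first message matching every non-None field.
        for i, msg in enumerate(self._messages):
            if all(p is None or p == m for p, m in zip(pattern, msg)):
                return self._messages.pop(i)
        return None

    def rec_s(self, sender=None, msg_type=None, content=None):
        """Block until a matching message has been received."""
        with self._cond:
            while True:
                msg = self._take((sender, msg_type, content))
                if msg is not None:
                    return msg
                self._cond.wait()

    def rec_a(self, sender=None, msg_type=None, content=None, timeout=0.0):
        """Return a matching message, or None if none arrives within `timeout`."""
        deadline = time.monotonic() + timeout
        with self._cond:
            while True:
                msg = self._take((sender, msg_type, content))
                if msg is not None:
                    return msg
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None              # "fails" when nothing matches
                self._cond.wait(remaining)


queue = ReceiveQueue()
queue.deliver("agent-a", "inform", "link up")
queue.deliver("agent-b", "request", "status?")
from_b = queue.rec_s(sender="agent-b")            # skips agent-a's message
nothing = queue.rec_a(msg_type="alert")           # no match, default timeout of zero
from_a = queue.rec_a(sender="agent-a", timeout=0.1)
```

With the default timeout of zero, rec_a inspects the queue once and returns immediately, which matches the default suggested in the text.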

(b) Each message may be represented as a tuple:

Msg=(Id, Sender, Recipient, Reference, Type, Content), in which
-   Id=unique message identifier;
-   Sender=mnemonic of sender of message (IP address, IOR, etc.);
-   Recipient=mnemonic of recipient of message (IP address, IOR, etc.);
-   Reference=reference to a message Id (this is an optional field, used if a hierarchical message layout is desired);
-   Type=the taxonomic identifier of this message; and
-   Content=the actual message itself.
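One possible in-memory form of this tuple is sketched below in Python. The field names follow the tuple above; the local counter used to generate identifiers and the `thread_of` helper, which follows Reference links to reconstruct a hierarchical exchange, are assumptions added for illustration.

```python
import itertools
from typing import NamedTuple, Optional

_next_id = itertools.count(1)    # assumption: a local counter stands in for unique ids


class Msg(NamedTuple):
    msg_id: int
    sender: str
    recipient: str
    reference: Optional[int]     # id of the message this one refers to, if any
    msg_type: str
    content: str


def make_msg(sender, recipient, msg_type, content, reference=None):
    return Msg(next(_next_id), sender, recipient, reference, msg_type, content)


def thread_of(msg, by_id):
    """Follow Reference links back to the root of a hierarchical exchange."""
    chain = [msg]
    while chain[-1].reference is not None:
        chain.append(by_id[chain[-1].reference])
    return list(reversed(chain))


ask = make_msg("10.0.0.5", "10.0.0.9", "request", "report load")
reply = make_msg("10.0.0.9", "10.0.0.5", "inform", "load=0.42", reference=ask.msg_id)
by_id = {m.msg_id: m for m in (ask, reply)}
thread = thread_of(reply, by_id)
```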

(c) The following schema declaration may be used to implement a PKB policy repository:

    [
      policy ( name: PolicyName )
      relation ( name: RelationName
                 domain: PolicyName₁ # ... # PolicyNameₙ )
      attribute ( name: AttributeName
                  policy: PolicyName
                  type: Type )
      default ( name: AttributeName
                value: DefaultValue )
      feature ( name: FeatureName
                policy: PolicyName
                type: Type
                init: Init )
      ...
    ]

(d) The assertional, retrieval and active information services described above are defined below:

PKB Assertional Services

createKBObj(Id)
    returns a unique identification of a newly created KB object.

createPolicyObj(Id, Policy)
    creates an instance of a concept denoted by concept and binds it to the object identified by Id.

createRelation(IdList, Relation)
    defines an instance of a new relation denoted by Relation Rel among the concept instances denoted by the object identifiers in IdList. The ordering of the members of IdList determines their ordering in the relation.

setValue(Id, Attribute, newValue)
    assigns the value denoted by newValue to the attribute of the concept instance denoted by Id.

deleteKBObj(Id)
    deletes an object; deleting an object that is bound to a concept instance deletes the concept instance and all instances of relations having this concept instance as a member.

DeletePolicy
    deletes the instance of a Policy denoted by IdList.

deleteRelation(IdList, Relation)
    deletes the instance of a Relation denoted by IdList.

removeValue(Id, Attribute)
    removes the value for the attribute of the concept instance denoted by Id.

PKB Retrieval Services

returnPolicy(PolicyList, IdList)
    returns a list of all instances of Policy.

returnPolicyBool(Id, Policy, bool)
    returns true if the policy denoted by Id is a member of the Policy Knowledge Base.

returnRelMembers(relation, ListofId)
    returns a list of policy instances denoting all tuples that define the relation desired.

returnMemberBool(IdList, Relation, IdList)
    returns true if the tuple denoted by IdList is a member of the Relation.

deleteKBObj(Id)
    deletes an object; deleting an object that is bound to a concept instance deletes the concept instance and all instances of relations having this concept instance as a member.

PKB Information/Versioning Services

versionPolicy(Policy, Source, Destination, Id)
    causes any modification of policy instances to be sent to the destination address specified, if requested.

versionRelation(Relation, Source, Destination, Id)
    causes any modification of relation instances to be sent to the destination address specified, if requested.
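To make the behaviour of a few of these services concrete, the following Python sketch implements an in-memory stand-in for part of the PKB. The storage layout (plain dictionaries and sets) is an assumption, as are the simplified signatures: for example, createKBObj here returns the new identifier rather than taking an Id argument, and returnPolicy and returnMemberBool return their results directly.

```python
import itertools


class PolicyKB:
    """Hypothetical in-memory stand-in for the PKB; storage layout is assumed."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.concepts = {}      # object id -> policy (concept) name
        self.values = {}        # (object id, attribute) -> value
        self.relations = {}     # relation name -> set of id tuples

    def createKBObj(self):
        return next(self._ids)

    def createPolicyObj(self, obj_id, policy):
        self.concepts[obj_id] = policy

    def createRelation(self, id_list, relation):
        # Member order in id_list is preserved in the stored tuple.
        self.relations.setdefault(relation, set()).add(tuple(id_list))

    def setValue(self, obj_id, attribute, new_value):
        self.values[(obj_id, attribute)] = new_value

    def deleteKBObj(self, obj_id):
        # Deleting an object also deletes every relation instance it belongs to.
        self.concepts.pop(obj_id, None)
        self.values = {k: v for k, v in self.values.items() if k[0] != obj_id}
        for name in self.relations:
            self.relations[name] = {t for t in self.relations[name] if obj_id not in t}

    def returnPolicy(self, policy):
        return sorted(i for i, p in self.concepts.items() if p == policy)

    def returnMemberBool(self, id_list, relation):
        return tuple(id_list) in self.relations.get(relation, set())


kb = PolicyKB()
qos, backup = kb.createKBObj(), kb.createKBObj()
kb.createPolicyObj(qos, "QoSPolicy")          # policy names are made up for the example
kb.createPolicyObj(backup, "BackupPolicy")
kb.setValue(qos, "maxLatencyMs", 50)
kb.createRelation([qos, backup], "overrides")
related_before = kb.returnMemberBool([qos, backup], "overrides")
kb.deleteKBObj(backup)
related_after = kb.returnMemberBool([qos, backup], "overrides")
```

The cascade in deleteKBObj mirrors the rule stated above that deleting a bound object removes the concept instance together with all relation instances of which it is a member.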

(e) The control unit 118 may use a PolicyMethodActor class such as the one shown below:

    class PolicyMethodActor
      attributes
        Name
        Type     /* atomic or continuous */
        Range    /* admissible input values */
      methods
        calibrate +<Name> {...}
        execute +<Name> +<Params> {...}     /* for Type = atomic */
        activate +<Name> +<Params> {...}    /* for Type = continuous */
        suspend +<Id> {...}                 /* for Type = continuous */
        deactivate +<Id> {...}              /* for Type = continuous */

(f) In the example below, “class Layer” defines the fundamental activity cycle of the control unit 118:

    class Layer
      attributes
        Higher, Lower     /* neighboring layers */
        Policy            /* current beliefs, goals, intentions */
        Sit, Sit-desc     /* situations and situation descriptions */
        Ops               /* operational primitives available */
        Actreq            /* activation requests from layer i−1 */
        Commitment        /* commit messages received from layer i+1 */
        [...]
      methods
        policyUpdate +<policy> {...}                /* policy update function */
        sitRec +<Sit> +<policy> +<Sit-desc> {...}   /* situation recognition */
        g-act +<Sit> +<Goals> {...}                 /* goal activation function */
        policyCheck +<Sg> +<Ops> {...}              /* policy checking function */
        op-select +<Sg> +<Ops> {...}                /* planning function */
        schedule +<Int> +<Ints> {...}               /* scheduling function */
        execute +<Int> {...}                        /* execution function */
        [...]
      cycle
        Policy = policyUpdate(Policy);
        Sit = sitRec(Sit ∪ Actreq, Policy, Sit-desc);
        Goals = g-act(Sit, Goals);
        Sg = {(S, G) | S ∈ Sit ∧ G ∈ Goals ∧ G = g-act(S, _)};
        (Comp-sg, Nocomp-sg) = policyCheck(Sg, Ops);
        foreach Sg′ ∈ Nocomp-sg     /* shift control to higher layer */
          Higher ← receive(request, activate(Sg′));
        Int = op-select(Comp-sg, Ops);
        Int = schedule(Int, Commitment);
        Int = execute(Int);
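The activity cycle above can be illustrated with a much-reduced Python sketch. It keeps only situation recognition, goal activation, the competence check, escalation to the higher layer, and execution; policy update and scheduling are omitted. The dictionaries used for `ops` and `goal_for`, and the situation and goal names in the usage lines, are assumptions made purely for illustration.

```python
class Layer:
    """Simplified sketch of one control layer; all data shapes are assumptions."""

    def __init__(self, ops, goal_for, higher=None):
        self.ops = ops              # goal -> callable that achieves it
        self.goal_for = goal_for    # situation -> goal activated by it
        self.higher = higher        # next layer up, or None
        self.actreq = []            # activation requests from the layer below

    def cycle(self, situations):
        # Situation recognition: merge sensed situations with pending requests.
        sit = list(situations) + self.actreq
        self.actreq = []
        # Goal activation: pair each recognised situation with its goal.
        sg = [(s, self.goal_for[s]) for s in sit if s in self.goal_for]
        # Competence check: can this layer's operations achieve the goal?
        comp = [(s, g) for s, g in sg if g in self.ops]
        nocomp = [(s, g) for s, g in sg if g not in self.ops]
        # Shift control upward for anything this layer cannot handle.
        if self.higher is not None:
            self.higher.actreq.extend(s for s, _ in nocomp)
        # Select and execute (scheduling here simply keeps arrival order).
        return [self.ops[g](s) for s, g in comp]


upper = Layer(ops={"reroute": lambda s: f"rerouted around {s}"},
              goal_for={"node-down": "reroute"})
lower = Layer(ops={"restart": lambda s: f"restarted after {s}"},
              goal_for={"service-hung": "restart", "node-down": "reroute"},
              higher=upper)
lower_result = lower.cycle(["service-hung", "node-down"])
upper_result = upper.cycle([])
```

In this example the lower layer handles the hung service itself and passes the node failure upward, where it is picked up as an activation request on the higher layer's next cycle.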

(g) The DomainSensor class defined below may be used to implement the domain adapter 115 described with reference to FIG. 7.

    class DomainSensor
      attributes
        Name
        Value
        Range
      methods
        calibrate +<Name> {...}
        enable +<Name> {...}
        disable +<Name> {...}
        get-val +<Name> {...}

(h) The perception buffer may be defined by the PerceptionBuffer class defined below:

    class PerceptionBuffer
      methods
        init {...}
        clear {...}
        get-val +<DomainSensor.name> {...}
        get-all {...}
        refresh +<DomainSensor.name> {...}
        refresh-all {...}
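A minimal Python sketch of how a perception buffer might cache sensor readings is given below. It assumes that get-val returns the last cached value and that only refresh and refresh-all touch the underlying sensor; the specification does not state this, and the `read` callable and sensor names are hypothetical.

```python
class DomainSensor:
    """Hypothetical sensor wrapper; `read` is any zero-argument callable."""

    def __init__(self, name, read):
        self.name = name
        self._read = read
        self.enabled = True

    def get_val(self):
        return self._read() if self.enabled else None


class PerceptionBuffer:
    def __init__(self, sensors):
        self._sensors = {s.name: s for s in sensors}
        self._values = {}
        self.clear()

    def clear(self):
        self._values = {name: None for name in self._sensors}

    def get_val(self, name):
        return self._values[name]          # cached reading, no device access

    def get_all(self):
        return dict(self._values)

    def refresh(self, name):
        self._values[name] = self._sensors[name].get_val()

    def refresh_all(self):
        for name in self._sensors:
            self.refresh(name)


readings = iter([0.30, 0.75])
buffer = PerceptionBuffer([DomainSensor("cpu_load", lambda: next(readings)),
                           DomainSensor("link_up", lambda: True)])
before = buffer.get_val("cpu_load")        # nothing sampled yet
buffer.refresh("cpu_load")
first = buffer.get_val("cpu_load")
repeat = buffer.get_val("cpu_load")        # same cached value, sensor not re-read
buffer.refresh_all()
snapshot = buffer.get_all()
```

Separating refresh from get-val in this way lets the control unit reason over a stable snapshot during one cycle while sensors continue to change.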

(i) Services are represented in Interface Description Language, i.e. ORB (IDL), as follows:

    module policy {
      struct ServiceOperationParameter {
        string classParameterName;
        string parameterType;
        long parameterNumber;
        ::com::netfuel::common::structures::ManagedObject serviceOperationParameterManagedObject;
      };
      typedef sequence<ServiceOperationParameter> ServiceOperationParameterList;

      struct ServiceOperation {
        string className;
        string classOperationName;
        string operationType;
        long numberOfParameters;
        ::com::netfuel::common::structures::ManagedObject serviceOperationManagedObject;
        ServiceOperationParameterList serviceOperationServiceOperationParameterList;
      };
      typedef sequence<ServiceOperation> ServiceOperationList;

      struct ServiceType {
        long long parentServiceTypeId;
        ::com::netfuel::common::structures::ManagedObject serviceTypeManagedObject;
      };
      typedef sequence<ServiceType> ServiceTypeList;

      struct Service {
        string className;
        ServiceType serviceServiceType;
        ::com::netfuel::common::structures::ManagedObject serviceManagedObject;
        ServiceOperationList serviceServiceOperationList;
        long long parentServiceId;
      };
      typedef sequence<Service> ServiceList;
    };

A policy can be represented in IDL as follows:

    module policy {
      struct PolicyOperationParameter {
        long parameterNumber;
        string parameterValue;
      };
      typedef sequence<PolicyOperationParameter> PolicyOperationParameterList;

      struct PolicyAction {
        long actionOrderNumber;
        ::com::netfuel::common::structures::ManagedObject policyActionManagedObject;
        ::com::netfuel::policy::structures::ServiceOperation policyActionServiceOperation;
        PolicyOperationParameterList policyActionOperationParameterList;
      };
      typedef sequence<PolicyAction> PolicyActionList;

      struct PolicyCondition {
        string evaluationOperator;
        string evaluationValue;
        long long evaluationPeriod;
        boolean conditionNegated;
        ::com::netfuel::common::structures::ManagedObject policyConditionManagedObject;
        ::com::netfuel::policy::structures::ServiceOperation policyConditionServiceOperation;
        PolicyOperationParameterList policyConditionOperationParameterList;
      };
      typedef sequence<PolicyCondition> PolicyConditionList;

      struct PolicyConditionGroup {
        long groupOrderNumber;
        ::com::netfuel::common::structures::ManagedObject policyConditionGroupManagedObject;
        PolicyConditionList policyConditionGroupPolicyConditionList;
      };
      typedef sequence<PolicyConditionGroup> PolicyConditionGroupList;

      struct PolicyTimePeriodCondition {
        ::com::netfuel::common::structures::DateTime beginDate;
        ::com::netfuel::common::structures::DateTime endDate;
        ::com::netfuel::common::structures::ManagedObject policyTimePeriodConditionManagedObject;
      };
      typedef sequence<PolicyTimePeriodCondition> PolicyTimePeriodConditionList;

      struct PolicyActionGroup {
        long actionGroupOrderNumber;
        ::com::netfuel::common::structures::ManagedObject policyActionGroupManagedObject;
        PolicyActionList policyActionGroupPolicyActionList;
      };
      typedef sequence<PolicyActionGroup> PolicyActionGroupList;

      struct PolicyRule {
        string enabled;
        long priority;
        boolean mandatoryEvaluation;
        string sequencedActionType;
        string conditionListType;
        ::com::netfuel::common::structures::ManagedObject policyRuleManagedObject;
        PolicyTimePeriodConditionList policyRulePolicyTimePeriodConditionList;
        PolicyConditionGroupList policyRulePolicyConditionGroupList;
        PolicyActionGroupList policyRulePolicyActionGroup;
      };
      typedef sequence<PolicyRule> PolicyRuleList;

      struct Policy {
        ::com::netfuel::common::structures::ManagedObject policyManagedObject;
        PolicyRuleList policyPolicyRuleList;
      };
      typedef sequence<Policy> PolicyList;
    }; // end module policy

1. A method of managing a computer network, comprising: assigning a goal to a software agent, wherein the goal is a programmatic expression of a predefined task for the software agent; and dynamically modifying the assigned goal of the software agent according to a desired operational characteristic of the computer network.
2. The method of claim 1, wherein the assigned goal of the agent is expressed as a policy.
3. The method of claim 1, further comprising: obtaining information about a network component by the software agent in performing the predefined task; and constructing a topological representation of the computer network from the information.
4. The method of claim 1, further comprising: monitoring an operational characteristic of the network by the software agent in performing the predefined task; and determining an appropriate modification to the assigned goal based on the monitoring and the desired operational characteristic.
5. The method of claim 4, wherein the determining uses a numerical method.
6. The method of claim 5, wherein the numerical method comprises Kohonen Self Organizing maps.
7. The method of claim 5, wherein the numerical method comprises a Dijkstra Self Stabilization Algorithm.
8. A computer network, comprising: a software agent having an assigned goal which is a programmatic expression of a predefined task for the software agent; an agent support mechanism to provide support to the agent; and a network control mechanism operable to dynamically modify the assigned goal of the agent based on a desired operational characteristic of the network.
9. The computer network of claim 8, wherein the software agent comprises a discovery agent having the assigned goal to discover information about a network component.
10. The computer network of claim 9, wherein the network control mechanism constructs a topological representation of the network from the information.
11. The computer network of claim 8, wherein the software agent comprises a monitoring agent having the assigned goal to monitor an operational characteristic of the network.
12. The computer network of claim 8, wherein the assigned goal of an agent is expressed as a policy.
13. The computer network of claim 8, wherein the network control mechanism comprises a communications mechanism to facilitate communications with agents.
14. The computer network of claim 13, wherein the communications mechanism comprises a secure communications protocol.
15. The computer network of claim 14, wherein the secure communications protocol encrypts a payload of a data packet.
16. The computer network of claim 14, wherein the secure communications protocol utilizes multiple data channels to transmit data packets.
17. The computer network of claim 16, wherein the data channels are randomly changed.